VIRAL MATERIAL AND NUCLEOTIDE FRAGMENTS ASSOCIATED WITH
MULTIPLE SCLEROSIS, FOR DIAGNOSTIC, PROPHYLACTIC AND
THERAPEUTIC PURPOSES

Multiple sclerosis (MS) is a demyelinating disease of the
central nervous system (CNS), the cause of which remains as
yet unknown.

Multiple sclerosis (MS) is the most common neurological
disease of young adults, with a prevalence in Europe and
North America of between 20 and 200 per 100,000. It is
characterized clinically by a relapsing/remitting or
chronic progressive course, frequently leading to severe
disability. Current knowledge suggests that MS is
associated with autoimmunity, that genetic background has
an important influence and that "infectious" agent(s) may
be involved. Indeed, many viruses have been proposed as
possible candidates but, as yet, none of them has been
shown to play an aetiological role.

Many studies have supported the hypothesis of a viral
aetiology of the disease, but none of the known viruses
tested has proved to be the causal agent sought: a review
of the viruses sought for several years in MS has been
compiled by E. Norrby (1) and R.T. Johnson (2).

The discovery of pathogenic retroviruses in man (HTLVs and
HIVs) was followed by great interest in their ability to
impair the immune system and to provoke central nervous
system inflammation and/or degeneration. In the case of
HTLV-1, its association with a chronic inflammatory
demyelinating disease in man (48) led to extensive
investigations to search for an HTLV-1-like retrovirus in
MS patients. However, despite initial claims, the presence
of HTLV-1 or HTLV-like retroviruses was not confirmed.
Recently, a retrovirus different from the known human
retroviruses has been isolated in patients suffering from
MS (3, 4 and 5).

In 1989, the authors described the production of
extracellular virions, associated with reverse
transcriptase (RT) activity, by a culture of
leptomeningeal cells (LM7) obtained from the cerebrospinal
fluid of a patient with MS (3). This was followed by
similar findings in monocyte cultures from a series of MS
patients (5). Neither viral particles nor viral RT
activity was found in control individuals. Furthermore,
the authors were able to transfer the LM7 virus to
non-infected leptomeningeal cells in vitro (26). The
molecular characterization of the "LM7" retrovirus was a
prerequisite for further evaluation of its possible role
in MS. Considerable difficulties arose from the absence of
continuously productive retroviral cultures and from the
low levels of expression in the few transient cultures.
The strategy described here focused on RNA from
extracellular virions, in order to avoid non-specific
detection of cellular RNA and of endogenous elements from
contaminating human DNA. A specific retroviral sequence
associated with virions produced by cell cultures from
several MS patients has been identified. The entire
sequence of this novel retroviral genome is currently
being obtained using RT-PCR on RNA from extracellular
virions. The retrovirus previously called "LM7 virus"
corresponds to an oncovirus and is now designated MSRV
(Multiple Sclerosis-associated Retrovirus).

The authors were also able to show that this retrovirus
could be transmitted in vitro, that patients suffering
from MS produced antibodies capable of recognizing
proteins associated with the infection of leptomeningeal
cells by this retrovirus, and that the expression of the
latter could be strongly stimulated by the immediate-early
genes of some herpesviruses (6).
All these results point to the role in MS of at least one
unknown retrovirus or of a virus having reverse
transcriptase activity which is detectable according to
the method published by H. Perron (3) and qualified as
"LM7-like RT" activity. The content of the publication
identified by (3) is incorporated in the present
description by reference.

Recently, the Applicant's studies have enabled two
continuous cell lines infected with natural isolates
originating from two different patients suffering from MS
to be obtained by a culture method as described in the
document WO-A-93/20188, the content of which is
incorporated in the present description by reference.
These two lines, derived from human choroid plexus cells,
designated LM7PC and PLI-2, were deposited with the ECACC
on 22nd July 1992 and 8th January 1993, respectively,
under numbers 92072201 and 93010817, in accordance with
the provisions of the Budapest Treaty. Moreover, the viral
isolates possessing LM7-like RT activity were also
deposited with the ECACC under the overall designation of
"strains". The "strain" or isolate harboured by the PLI-2
line, designated POL-2, was deposited with the ECACC on
22nd July 1992 under No. V92072202. The "strain" or
isolate harboured by the LM7PC line, designated MS7PG, was
deposited with the ECACC on 8th January 1993 under
No. V93010816.

Starting from the cultures and isolates mentioned above,
characterized by biological and morphological criteria,
the next step was to endeavour to characterize the nucleic
acid material associated with the viral particles produced
in these cultures.

The portions of the genome which have already been
characterized have been used to develop tests for
molecular detection of the viral genome and
immunoserological tests, using the amino acid sequences
encoded by the nucleotide sequences of the viral genome,
in order to detect the immune response directed against
epitopes associated with the infection and/or viral
expression.

These tools have already enabled an association to be
confirmed between MS and the expression of the sequences
identified in the patents cited later. However, the viral
system discovered by the Applicant is related to a complex
retroviral system. In effect, the sequences to be found
encapsidated in the extracellular viral particles produced
by the different cultures of cells of patients suffering
from MS show clearly that there is coencapsidation of
retroviral genomes which are related to but different from
the "wild-type" retroviral genome which produces the
infective viral particles. This phenomenon has been
observed between replicative retroviruses and endogenous
retroviruses belonging to the same family, or even
heterologous retroviruses. The notion of endogenous
retroviruses is very important in the context of our
discovery since, in the case of MSRV-1, it has been
observed that endogenous retroviral sequences comprising
sequences homologous to the MSRV-1 genome exist in normal
human DNA. The existence of endogenous retroviral elements
(ERV) related to MSRV-1 by all or part of their genome
explains the fact that the expression of the MSRV-1
retrovirus in human cells is able to interact with closely
related endogenous sequences. These interactions are to be
found in the case of pathogenic and/or infectious
endogenous retroviruses (for example some ecotropic
strains of the murine leukaemia virus), and in the case of
exogenous retroviruses whose nucleotide sequence may be
found partially or wholly, in the form of ERVs, in the
host animal's genome (e.g. mouse exogenous mammary tumor
virus transmitted via the milk). These interactions
consist mainly of (i) a trans-activation or coactivation
of ERVs by the replicative retrovirus, (ii) an
"illegitimate" encapsidation of RNAs related to ERVs, or
of ERVs - or even of cellular RNAs - simply possessing
compatible encapsidation sequences, in the retroviral
particles produced by the expression of the replicative
strain, which are sometimes transmissible and sometimes
with a pathogenicity of their own, and (iii) more or less
substantial recombinations between the coencapsidated
genomes, in particular in the phases of reverse
transcription, which lead to the formation of hybrid
genomes, which are sometimes transmissible and sometimes
with a pathogenicity of their own.

Thus, (i) different sequences related to MSRV-1 have been
found in the purified viral particles; (ii) molecular
analysis of the different regions of the MSRV-1 retroviral
genome should be carried out by systematically analyzing
the coencapsidated, interfering and/or recombined
sequences which are generated by the infection and/or
expression of MSRV-1; furthermore, some clones may have
defective sequence portions produced by the retroviral
replication and template errors and/or errors of
transcription of the reverse transcriptase; (iii) the
families of sequences related to the same retroviral
genomic region provide the means for an overall diagnostic
detection which may be optimized by the identification of
invariable regions among the clones expressed, and by the
identification of reading frames responsible for the
production of antigenic and/or pathogenic polypeptides
which may be produced only by a portion, or even by just
one, of the clones expressed, and, under these conditions,
the systematic analysis of the clones expressed in the
region of a given gene enables the frequency of variation
and/or of recombination of the MSRV-1 genome in this
region to be evaluated and the optimal sequences for the
applications, in particular diagnostic applications, to be
defined; (iv) the pathology caused by a retrovirus such as
MSRV-1 may be a direct effect of its expression and of the
proteins or peptides produced as a result thereof, but
also an effect of the activation, the encapsidation or the
recombination of related or heterologous genomes and of
the proteins or peptides produced as a result thereof;
thus, these genomes associated with the expression of
and/or infection by MSRV-1 are an integral part of the
potential pathogenicity of this virus, and hence
constitute means of diagnostic detection and special
therapeutic targets. Similarly, any agent associated with
or cofactor of these interactions responsible for the
pathogenesis in question, such as MSRV-2 or the gliotoxic
factor which are described in the patent application
published under No. FR-2,716,198, may participate in the
development of an overall and very effective strategy for
the diagnosis, prognosis, therapeutic monitoring and/or
integrated therapy of MS in particular, but also of any
other disease associated with the same agents.

In this context, a parallel discovery has been made in
another autoimmune disease, rheumatoid arthritis (RA),
which has been described in the French Patent Application
filed under No. 95/02960. This discovery shows that, by
applying methodological approaches similar to the ones
which were used in the Applicant's work on MS, it was
possible to identify a retrovirus expressed in RA which
shares the sequences described for MSRV-1 in MS, and also
the coexistence of an associated MSRV-2 sequence also
described in MS. As regards MSRV-1, the sequences detected
in common in MS and RA relate to the pol and gag genes. In
the current state of knowledge, it is possible to
associate the gag and pol sequences described with the
MSRV-1 strains expressed in these two diseases.

The present patent application relates to various results
which are additional to those already protected by the
following French Patent Applications:

- No. 92/04322 of 03.04.1992, published under
No. 2,689,519;

- No. 92/13447 of 03.11.1992, published under
No. 2,689,521;

- No. 92/13443 of 03.11.1992, published under
No. 2,689,520;

- No. 94/01529 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01531 of 04.02.1994, published under
No. 2,715,939;

- No. 94/01530 of 04.02.1994, published under
No. 2,715,936;

- No. 94/01532 of 04.02.1994, published under
No. 2,715,937;

- No. 94/14322 of 24.11.1994, published under
No. 2,727,428;

- and No. 94/15810 of 23.12.1994, published under
No. 2,728,585.

The present invention relates, in the first place, to a
viral material, in the isolated or purified state, which
may be recognized or characterized in different ways:

- its genome comprises a nucleotide sequence chosen from
the group including the sequences SEQ ID NO:46, SEQ ID
NO:51, SEQ ID NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID
NO:58, SEQ ID NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID
NO:89, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 70% homology with the said
sequences SEQ ID NO:46, SEQ ID NO:51, SEQ ID NO:52, SEQ ID
NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:59, SEQ ID
NO:60, SEQ ID NO:61, SEQ ID NO:89, respectively, and their
complementary sequences;

- the region of its genome comprising the env and pol
genes and a portion of the gag gene, excluding the
subregion having a sequence identical or equivalent to
SEQ ID NO:1, codes for any polypeptide displaying, for any
contiguous succession of at least 30 amino acids, at least
50% and preferably at least 70% homology with a peptide
sequence encoded by any nucleotide sequence chosen from
the group including SEQ ID NO:46, SEQ ID NO:51, SEQ ID
NO:52, SEQ ID NO:53, SEQ ID NO:56, SEQ ID NO:58, SEQ ID
NO:59, SEQ ID NO:60, SEQ ID NO:61, SEQ ID NO:89 and their
complementary sequences;

- the pol gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:57 or SEQ
ID NO:93, excluding SEQ ID NO:1;

- the gag gene comprises a nucleotide sequence partially
or totally identical or equivalent to SEQ ID NO:88.
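
The window-based homology criterion used in these definitions can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the text: "homology" is read here as percent identity, the two sequences are taken to be already aligned to equal length without gaps, and "any succession" is read as "every window". The function names are hypothetical.

```python
def window_identity(a: str, b: str, window: int = 100) -> list[float]:
    """Percent identity of each ungapped window of `window` aligned positions.

    Assumption: `a` and `b` are pre-aligned and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(a) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [x == y for x, y in zip(a.upper(), b.upper())]
    count = sum(matches[:window])          # matches in the first window
    scores = [100.0 * count / window]
    for i in range(window, len(matches)):  # slide the window by one position
        count += matches[i] - matches[i - window]
        scores.append(100.0 * count / window)
    return scores


def is_equivalent(a: str, b: str, window: int = 100, threshold: float = 50.0) -> bool:
    """True if every window reaches the threshold (one reading of the criterion)."""
    return min(window_identity(a, b, window)) >= threshold
```

With the values given in the text, `is_equivalent(a, b, 100, 50.0)` would correspond to the broader threshold and `is_equivalent(a, b, 100, 70.0)` to the preferred one; a real comparison would first require an alignment step that this sketch leaves out.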

As indicated above, according to the present invention,
the viral material as defined above is associated with MS.
And as defined by reference to the pol or gag gene of
MSRV-1, and more especially to the sequences SEQ ID NOS
51, 56, 57, 59, 60, 61, 88, 89, 93, 169, 170, 171, 172,
176, 177, 178 and 179, this viral material is associated
with RA.

The present invention also relates to a nucleic material,
in the isolated or purified state, having at least one of
the following definitions:

- a nucleic material comprising a nucleotide sequence
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences, excluding HSERV-9 (or ERV-9); advantageously,
the nucleotide sequence of said nucleic material is
selected from the group including sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequences SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
coding for any polypeptide displaying, for any contiguous
succession of at least 30 amino acids, at least 50%,
preferably at least 60%, and most preferably at least 70%
homology with a peptide sequence encoded by any nucleotide
sequence selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179 and their complementary
sequences;

- a nucleic material, in the isolated or purified state,
of retroviral type, comprising a nucleotide sequence
identical or similar to at least part of the pol gene of
an isolated retrovirus associated with multiple sclerosis
or rheumatoid arthritis; advantageously, said nucleotide
sequence is 80% similar to said at least part of the pol
gene;

- a nucleic material comprising a nucleotide sequence
identical or similar to at least part of the pol gene of an
isolated virus encoding a reverse transcriptase having an
enzymatic site comprised between the amino acid domains
LPQG-YXDD, having a phylogenic distance with HSERV-9 of
0.063 ± 0.1, and preferably 0.063 ± 0.05; the phylogenic
distances are calculated on the basis of a reference
sequence according to UPGM tree option of the Geneworks™
Software (INTELLIGENETICS).

By enzymatic site, we understand the amino acid domain(s)
conferring the specific activity of a given enzyme.
The present invention also relates to different nucleotide
fragments each comprising a nucleotide sequence chosen
from the group including:

(a) all the genomic sequences, partial and total, of the
pol gene of the MSRV-1 virus, except for the total
sequence of the nucleotide fragment defined by
SEQ ID NO:1;

(b) all the genomic sequences, partial and total, of the
env gene of MSRV-1;

(c) all the partial genomic sequences of the gag gene of
MSRV-1;

(d) all the genomic sequences overlapping the pol gene and
the env gene of the MSRV-1 virus, and overlapping the pol
gene and the gag gene;

(e) all the sequences, partial and total, of a clone
chosen from the group including the clones FBd3
(SEQ ID NO:46), t pol (SEQ ID NO:51), JLBc1
(SEQ ID NO:52), JLBc2 (SEQ ID NO:53) and GM3
(SEQ ID NO:56), FBd13 (SEQ ID NO:58), LB19 (SEQ ID NO:59),
LTRGAG12 (SEQ ID NO:60), FP6 (SEQ ID NO:61), G+E+A
(SEQ ID NO:89), excluding any nucleotide sequence
identical to or lying within the sequence defined by
SEQ ID NO:1;

(f) sequences complementary to the said genomic sequences;

(g) sequences equivalent to the said sequences (a) to (e),
in particular nucleotide sequences displaying, for any
succession of 100 contiguous monomers, at least 50% and
preferably at least 70% homology with the said sequences
(a) to (d),

provided that this nucleotide fragment does not comprise
or consist of the sequence ERV-9 as described in LA MANTIA
et al. (18).
The term genomic sequences, partial or total, includes all
sequences associated by coencapsidation or by
coexpression, or recombined sequences.

Preferably, such a fragment comprises:

- either a nucleotide sequence identical to a partial or
total genomic sequence of the pol gene of the MSRV-1
virus, except for the total sequence of the nucleotide
fragment defined by SEQ ID NO:1, or identical to any
sequence equivalent to the said partial or total genomic
sequence, in particular one which is homologous to the
latter;

- or a nucleotide sequence identical to a partial or total
genomic sequence of the env gene of the MSRV-1 virus, or
identical to any sequence complementary to the said
nucleotide sequence, or identical to any sequence
equivalent to the said nucleotide sequence, in particular
one which is homologous to the latter.

In particular, the invention relates to a nucleotide
fragment comprising a coding nucleotide sequence which is
partially or totally identical to a nucleotide sequence
chosen from the group including:

- the nucleotide sequence defined by SEQ ID NO:40, SEQ ID
NO:62 or SEQ ID NO:89;

- sequences complementary to SEQ ID NO:40, SEQ ID NO:62 or
SEQ ID NO:89;

- sequences equivalent, and in particular homologous to
SEQ ID NO:40, SEQ ID NO:62 or SEQ ID NO:89;

- sequences coding for all or part of the peptide sequence
defined by SEQ ID NO:39, SEQ ID NO:63 or SEQ ID NO:90;

- sequences coding for all or part of a peptide sequence
equivalent, in particular homologous to SEQ ID NO:39, SEQ
ID NO:63 or SEQ ID NO:90, which is capable of being
recognized by sera of patients infected with the MSRV-1
virus, or in whom the MSRV-1 virus has been reactivated.
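
The notion of a "complementary sequence" referred to in the list above can be sketched as follows. This assumes the usual reading, namely the reverse complement of the strand written 5' to 3'; the text itself does not spell out the convention, and the function name is hypothetical.

```python
# Complement table for DNA; U is mapped to A so that RNA input is accepted,
# and the result is always written as DNA.
_COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read in the 5'-3' direction."""
    return seq.translate(_COMPLEMENT)[::-1]


# Toy input for illustration only.
complementary = reverse_complement("ATGC")
```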
The invention also relates to a nucleotide fragment
(called fragment I) having at least one of the following
definitions:

- a nucleotide fragment comprising a nucleotide sequence
selected from the group including SEQ ID NO:93,
SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170, SEQ ID NO:171,
SEQ ID NO:172, SEQ ID NO:176, SEQ ID NO:177,
SEQ ID NO:178, SEQ ID NO:179, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 50% and preferably at least
60% homology with said sequences and their complementary
sequences, said group excluding SEQ ID NO:1,
said nucleotide fragment not comprising nor consisting of
the sequence HSERV-9 (or ERV-9); preferably the nucleotide
sequence of said fragment is selected from the group
including SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 70% and preferably at least 80% homology with
said sequences and their complementary sequences;

- a nucleotide fragment comprising a coding nucleotide
sequence which is partially or totally identical to a
nucleotide sequence selected from the group including:

SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169,
SEQ ID NO:170, SEQ ID NO:171, SEQ ID NO:172,
SEQ ID NO:176, SEQ ID NO:177, SEQ ID NO:178,
SEQ ID NO:179; their complementary sequences; their
equivalent sequences, in particular homologous to
SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:169, SEQ ID NO:170,
SEQ ID NO:171, SEQ ID NO:172, SEQ ID NO:176,
SEQ ID NO:177, SEQ ID NO:178, SEQ ID NO:179;

sequences encoding all or part of the peptide
sequence defined by SEQ ID NO:95, SEQ ID NO:173,
SEQ ID NO:174, SEQ ID NO:175, SEQ ID NO:180,
SEQ ID NO:181, SEQ ID NO:182;

sequences encoding all or part of a peptide
sequence equivalent, in particular homologous to
SEQ ID NO:95, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:175,
SEQ ID NO:180, SEQ ID NO:181, SEQ ID NO:182, which is
capable of being recognized by sera of patients infected
with the MSRV-1 virus, or in whom the MSRV-1 virus has
been reactivated.

The invention also relates to any nucleic acid probe for
the detection of virus associated with MS and/or
rheumatoid arthritis (RA), which is capable of hybridizing
specifically with any fragment such as is defined above,
belonging or lying within the genome of the said
pathogenic agent. It relates, in addition, to any nucleic
acid probe for detection of a pathogenic and/or infective
agent associated with RA, which is capable of hybridizing
specifically with any fragment as defined above by
reference to the pol and gag genes, and especially with
respect to the sequences SEQ ID NOS 40, 51, 56, 59, 60,
61, 62, 89 and SEQ ID NOS 39, 63 and 90.

The invention also relates to a primer for the
amplification by polymerization of an RNA or a DNA of a
viral material, associated with MS and/or RA, comprising a
nucleotide sequence identical or equivalent to at least
one portion of the nucleotide sequence of any fragment
such as is defined above, in particular a nucleotide
sequence displaying, for any succession of at least 10
contiguous monomers, preferably 15 contiguous monomers,
more preferably 18 contiguous monomers and even most
preferably 20 contiguous monomers, at least 70% homology
with at least the said portion of the said fragment.
Preferably, the nucleotide sequence of such a primer is
identical to any one of the sequences selected from the
group including SEQ ID NO:47 to SEQ ID NO:50,
SEQ ID NO:55, SEQ ID NO:64, SEQ ID NO:86, SEQ ID NO:99 to
SEQ ID NO:111, SEQ ID NO:183, SEQ ID NO:184,
SEQ ID NO:185, SEQ ID NO:186.

Generally speaking, the invention also encompasses any RNA
or DNA, and in particular replication vector, comprising a
genomic fragment of the viral material such as is defined
above, or a nucleotide fragment such as is defined above.

The invention also relates to the different peptides
encoded by any open reading frame belonging to a
nucleotide fragment such as is defined above, in
particular any polypeptide, for example any oligopeptide
forming or comprising an antigenic determinant recognized
by sera of patients infected with the MSRV-1 virus and/or
in whom the MSRV-1 virus has been reactivated. Preferably,
this polypeptide is antigenic, and is encoded by the open
reading frame beginning, in the 5'-3' direction, at
nucleotide 181 and ending at nucleotide 330 of
SEQ ID NO:1.

The invention also encompasses the following
polypeptides:

a)

- a polypeptide encoded by any open reading frame
belonging to a nucleotide fragment, fragment I, as defined
above;

- a polypeptide, characterized in that the open reading
frame encoding it is comprised, in the 5'-3' direction,
between nucleotide 18 and nucleotide 2304 of SEQ ID NO:93;

- a polypeptide, having a peptide sequence comprising a
sequence partially or totally identical to SEQ ID NO:95;

b)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:96; in particular said polypeptide
exhibits an enzymatic activity consisting of proteolytic
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 18 and ends at nucleotide
340 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
proteolytic activity of a polypeptide as defined according
to b);

c)

- a polypeptide, recombinant or synthetic, having a
peptide sequence which comprises a sequence identical or
equivalent to SEQ ID NO:97; in particular said polypeptide
exhibits a reverse transcriptase activity;

- a polypeptide having a peptide sequence which comprises
a sequence identical or equivalent to SEQ ID NO:98; in
particular said polypeptide exhibits a ribonuclease
activity;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 341 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide, recombinant or synthetic, characterized
in that the open reading frame encoding it begins, in the
5'-3' direction, at nucleotide 1858 and ends at nucleotide
2304 of SEQ ID NO:93;

- a polypeptide having an inhibitory activity on the
reverse transcriptase activity of a polypeptide as defined
according to c) or on the ribonuclease H activity of a
polypeptide as defined according to c).
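
The polypeptides above are identified by nucleotide coordinates within SEQ ID NO:93 (for example 18 to 2304, 18 to 340, 341 to 2304 and 1858 to 2304). The sketch below shows how such a region might be extracted and translated. It assumes 1-based inclusive coordinates and the standard genetic code, neither of which is stated explicitly in this passage; the helper names and the toy sequence are hypothetical, and SEQ ID NO:93 itself is not reproduced.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def subsequence(seq: str, start: int, end: int) -> str:
    """Extract nucleotides start..end (assumed 1-based and inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def translate(nt: str) -> str:
    """Translate whole codons from the first base; a trailing partial codon is dropped."""
    nt = nt.upper().replace("U", "T")
    usable = len(nt) - len(nt) % 3
    # Codons containing ambiguous bases are rendered as "X".
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


# Toy sequence for illustration only: two flanking bases on each side of an ORF.
toy = "CC" + "ATGGCCAAATGA" + "GG"
peptide = translate(subsequence(toy, 3, 14))
```

Given the real sequence, a call such as `translate(subsequence(seq_93, 18, 340))` would apply the same steps to the region cited under b); `seq_93` is a placeholder name for data that would have to be taken from the sequence listing.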

In particular, the invention relates to an antigenic
polypeptide recognized by the sera of patients infected
with the MSRV-1 virus, and/or in whom the MSRV-1 virus has
been reactivated, whose peptide sequence is partially or
totally identical or is equivalent to the sequence defined
by SEQ ID NO:39, SEQ ID NO:63, SEQ ID NO:87, SEQ ID NO:95,
SEQ ID NO:96, SEQ ID NO:97, SEQ ID NO:98, SEQ ID NO:173,
SEQ ID NO:174, SEQ ID NO:175, SEQ ID NO:180, SEQ ID NO:181
and SEQ ID NO:182; such a sequence is identical, for
example, to any sequence selected from the group including
the sequences SEQ ID NO:41 to SEQ ID NO:44, SEQ ID NO:63
and SEQ ID NO:87.

The present invention also proposes mono- or polyclonal
antibodies directed against the MSRV-1 virus, which are
obtained by the immunological reaction of a human or
animal body or cells to an immunogenic agent consisting of
an antigenic polypeptide such as is defined above.

The invention next relates to:

- reagents for detection of the MSRV-1 virus, or of an
exposure to the latter, comprising at least one reactive
substance selected from the group consisting of a probe of
the present invention, a polypeptide, in particular an
antigenic peptide, such as is defined above, or an
anti-ligand, in particular an antibody to the said
polypeptide;

- all diagnostic, prophylactic or therapeutic compositions
comprising one or more peptides, in particular antigenic
peptides, such as are defined above, or one or more
anti-ligands, in particular antibodies to the peptides,
discussed above; such a composition is preferably, and by
way of example, a vaccine composition.

The invention also relates to any diagnostic, prophylactic
or therapeutic composition, in particular for inhibiting
the expression of at least one virus associated with MS or
RA, and/or the enzymatic activities of the proteins of
said virus, comprising a nucleotide fragment such as is
defined above or a polynucleotide, in particular
oligonucleotide, whose sequence is partially identical to
that of the said fragment, except for that of the fragment
having the nucleotide sequence SEQ ID NO:1. Likewise, it
relates to any diagnostic, prophylactic or therapeutic
composition, in particular for inhibiting the expression
of at least one pathogenic and/or infective agent
associated with RA, comprising a nucleotide fragment such
as is defined above by reference to the pol and gag genes,
and especially with respect to the sequences SEQ ID NOS
40, 51, 56, 59, 60, 61, 62 and 89.

According to the invention, these same fragments or
polynucleotides, in particular oligonucleotides, may
participate in all suitable compositions for detecting,
according to any suitable process or method, a
pathological and/or infective agent associated with MS and
with RA, respectively, in a biological sample. In such a
process, an RNA and/or a DNA presumed to belong to or
originate from the said pathological and/or infective
agent, and/or their complementary RNA and/or DNA, is/are
brought into contact with such a composition.

The present invention also relates to any 
process for detecting the presence or exposure to such a 
pathological and/ or infective agent , in a biological 

20 sample, by bringing this sample into contact with a 
peptide, in particular an antigenic peptide such as is 
defined above, or an anti-ligand, in particular an anti- 
body to this peptide, such as is defined above. 

In practice, and for example, a device for
detection of the MSRV-1 virus comprises a reagent such as
is defined above, supported by a solid support which is
immunologically compatible with the reagent, and a means
for bringing the biological sample, for example a sample
of blood or of cerebrospinal fluid, likely to contain
anti-MSRV-1 antibodies, into contact with this reagent
under conditions permitting a possible immunological
reaction, the foregoing items being accompanied by means
for detecting the immune complex formed with this reagent.

Lastly, the invention also relates to the detection
of anti-MSRV-1 antibodies in a biological sample, for
example a sample of blood or of cerebrospinal fluid,
according to which this sample is brought into contact
with a reagent such as is defined above, consisting of an
antibody, under conditions permitting their possible
immunological reaction, and the presence of the immune
complex thereby formed with the reagent is then detected.

Before describing the invention in detail, 
different terms used in the description and the claims are 
now defined: 

- strain or isolate is understood to mean any
infective and/or pathogenic biological fraction containing,
for example, viruses and/or bacteria and/or parasites,
generating pathogenic and/or antigenic power,
harboured by a culture or a living host; as an example, a
viral strain according to the above definition can contain
a coinfective agent, for example a pathogenic protist,

- the term "MSRV" used in the present
description denotes any pathogenic and/or infective agent
associated with MS, in particular a viral species, the
attenuated strains of the said viral species or the
defective-interfering particles or particles containing
coencapsidated genomes, or alternatively genomes
recombined with a portion of the MSRV-1 genome, derived
from this species. Viruses, and especially viruses
containing RNA, are known to have a variability resulting,
in particular, from relatively high rates of spontaneous
mutation (7), which will be borne in mind below for
defining the notion of equivalence,

- human virus is understood to mean a virus
capable of infecting, or of being harboured by, human
beings,

- in view of all the natural or induced variations
and/or recombination which may be encountered when
implementing the present invention, the subjects of the
latter, defined above and in the claims, have been
expressed including the equivalents or derivatives of the
different biological materials defined below, in
particular of the homologous nucleotide or peptide
sequences,

- the variant of a virus or of a pathogenic
and/or infective agent according to the invention
comprises at least one antigen recognized by at least one
antibody directed against at least one corresponding
antigen of the said virus and/or said pathogenic and/or
infective agent, and/or a genome any part of which is
detected by at least one hybridization probe and/or at
least one nucleotide amplification primer specific for the
said virus and/or pathogenic and/or infective agent, such
as, for example, for the MSRV-1 virus, the primers and
probes having a nucleotide sequence chosen from
SEQ ID NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16
to SEQ ID NO:19, SEQ ID NO:31 to SEQ ID NO:33,
SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49,
SEQ ID NO:50, SEQ ID NO:45 and their complementary
sequences, under particular hybridization conditions well
known to a person skilled in the art,

- according to the invention, a nucleotide
fragment or an oligonucleotide or polynucleotide is an
arrangement of monomers, or a biopolymer, characterized by
the informational sequence of the natural nucleic acids,
which is capable of hybridizing with any other nucleotide
fragment under predetermined conditions, it being possible
for the arrangement to contain monomers of different
chemical structures and to be obtained from a molecule of
natural nucleic acid and/or by genetic recombination
and/or by chemical synthesis; a nucleotide fragment may be
identical to a genomic fragment of the MSRV-1 virus
discussed in the present invention, in particular a gene
of this virus, for example pol or env in the case of the
said virus,

- thus, a monomer can be a natural nucleotide of
nucleic acid whose constituent elements are a sugar, a
phosphate group and a nitrogenous base; in RNA the sugar
is ribose, in DNA the sugar is 2-deoxyribose; depending on
whether the nucleic acid is DNA or RNA, the nitrogenous
base is chosen from adenine, guanine, uracil, cytosine and
thymine; or the nucleotide can be modified in at least one
of the three constituent elements; as an example, the
modification can occur in the bases, generating modified
bases such as inosine, 5-methyldeoxycytidine,
deoxyuridine, 5-(dimethylamino)deoxyuridine,
2,6-diaminopurine, 5-bromodeoxyuridine and any other modified
base promoting hybridization; in the sugar, the
modification can consist of the replacement of at least
one deoxyribose by a polyamide (8), and in the phosphate
group, the modification can consist of its replacement by
esters chosen, in particular, from diphosphate, alkyl- and
arylphosphonate and phosphorothioate esters,

- "informational sequence" is understood to mean
any ordered succession of monomers whose chemical nature
and order in a reference direction may or may not constitute
an item of functional information of the same quality as
that of the natural nucleic acids,

- hybridization is understood to mean the
process during which, under suitable working conditions,
two nucleotide fragments having sufficiently complementary
sequences pair to form a complex structure, in particular
double or triple, preferably in the form of a helix,

- a probe comprises a nucleotide fragment synthesized
chemically or obtained by digestion or enzymatic
cleavage of a longer nucleotide fragment, comprising at
least six monomers, advantageously from 10 to 1000 monomers,
preferably 10 to 30 monomers and more preferably 18
to 30, and possessing a specificity of hybridization under
particular conditions; preferably, a probe possessing
fewer than 10 monomers, and preferably fewer than 15
monomers, is not used alone, but is used in the presence of
other probes of equally short size or otherwise; under
certain special conditions, it may be useful to use probes
of size greater than 100 monomers; a probe may be used, in
particular, for diagnostic purposes, such molecules being,
for example, capture and/or detection probes,

- the capture probe may be immobilized on a
solid support by any suitable means, that is to say
directly or indirectly, for example by covalent bonding or
passive adsorption,

- the detection probe may be labelled by means
of a label chosen, in particular, from radioactive
isotopes, enzymes chosen, in particular, from peroxidase
and alkaline phosphatase and those capable of hydrolysing
a chromogenic, fluorogenic or luminescent substrate,
chromophoric chemical compounds, chromogenic, fluorogenic
or luminescent compounds, nucleotide base analogues and
biotin,

- the probes used for diagnostic purposes of the
invention may be employed in all known hybridization
techniques, and in particular the techniques termed
"DOT-BLOT" (9), "SOUTHERN BLOT" (10), "NORTHERN BLOT", which is
a technique identical to the "SOUTHERN BLOT" technique but
which uses RNA as target, and the SANDWICH technique (11);
advantageously, the SANDWICH technique is used in the
present invention, comprising a specific capture probe
and/or a specific detection probe, on the understanding
that the capture probe and the detection probe must
possess an at least partially different nucleotide
sequence,

- any probe according to the present invention
can hybridize in vivo or in vitro with RNA and/or with DNA
in order to block the phenomena of replication, in
particular translation and/or transcription, and/or to
degrade the said DNA and/or RNA,

- a primer is a probe comprising at least six
monomers, and advantageously from 10 to 30 monomers, and
preferably from 18 to 25 monomers, possessing a
specificity of hybridization under particular conditions
for the initiation of an enzymatic polymerization, for
example in an amplification technique such as PCR
(polymerase chain reaction), in an elongation process such
as sequencing, in a method of reverse transcription or the
like,

- two nucleotide or peptide sequences are termed
equivalent or derived with respect to one another, or with
respect to a reference sequence, if functionally the
corresponding biopolymers can perform substantially the
same role, without being identical, as regards the
application or use in question, or in the technique in
which they participate; two sequences are, in particular,
equivalent if they are obtained as a result of natural
variability, in particular spontaneous mutation of the
species from which they have been identified, or induced
variability, as are two homologous sequences, homology
being defined below,

- "variability" is understood to mean any
spontaneous or induced modification of a sequence, in
particular by substitution and/or insertion and/or deletion
of nucleotides and/or of nucleotide fragments, and/or
extension and/or shortening of the sequence at one or both
ends; an unnatural variability can result from the genetic
engineering techniques used, for example the choice of
synthesis primers, degenerate or otherwise, selected for
amplifying a nucleic acid; this variability can manifest
itself in modifications of any starting sequence,
considered as reference, and capable of being expressed by
a degree of homology relative to the said reference
sequence,

- homology characterizes the degree of identity
of two nucleotide or peptide fragments compared; it is
measured by the percentage identity which is determined,
in particular, by direct comparison of nucleotide or
peptide sequences, relative to reference nucleotide or
peptide sequences,

- this percentage identity has been specifically
determined for the nucleotide fragments, clones in
particular, dealt with in the present invention, which are
homologous to the fragments identified, for the MSRV-1
virus, by SEQ ID NO:1 to NO:9, SEQ ID NO:46, SEQ ID NO:51
to SEQ ID NO:53, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57
and SEQ ID NO:93, as well as for the probes and primers
homologous to the probes and primers identified by
SEQ ID NO:20 to SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:16 to
SEQ ID NO:19, SEQ ID NO:31 to SEQ ID NO:33, SEQ ID NO:45,
SEQ ID NO:47, SEQ ID NO:48, SEQ ID NO:49, SEQ ID NO:50,
SEQ ID NO:55, SEQ ID NO:40, SEQ ID NO:56, SEQ ID NO:57 and
SEQ ID NO:99 to SEQ ID NO:111; as an example, the smallest
percentage identity observed between the different general
consensus sequences of nucleic acids obtained from
fragments of MSRV-1 viral RNA, originating from the LM7PC
and PLI-2 lines according to a protocol detailed later, is
67% in the region described in Figure 1,

- any nucleotide fragment is termed equivalent
or derived from a reference fragment if it possesses a
nucleotide sequence equivalent to the sequence of the
reference fragment; according to the above definition, the
following in particular are equivalent to a reference
nucleotide fragment:

a) any fragment capable of hybridizing at least
partially with the complement of the reference fragment,

b) any fragment whose alignment with the reference
fragment results in the demonstration of a larger
number of identical contiguous bases than with any other
fragment originating from another taxonomic group,

c) any fragment resulting, or capable of resulting,
from the natural variability of the species from
which it is obtained,

d) any fragment capable of resulting from the
genetic engineering techniques applied to the reference
fragment,

e) any fragment containing at least eight
contiguous nucleotides encoding a peptide which is
homologous or identical to the peptide encoded by the
reference fragment,

f) any fragment which is different from the
reference fragment by insertion, deletion or substitution
of at least one monomer, or extension or shortening at one
or both of its ends; for example, any fragment
corresponding to the reference fragment flanked at one or
both of its ends by a nucleotide sequence not coding for a
polypeptide,

- polypeptide is understood to mean, in particular,
any peptide of at least two amino acids, in particular
an oligopeptide, or protein, and for example an
enzyme, extracted, separated or substantially isolated or
synthesized through human intervention, in particular
those obtained by chemical synthesis or by expression in a
recombinant organism,

- polypeptide partially encoded by a nucleotide
fragment is understood to mean a polypeptide possessing at
least three amino acids encoded by at least nine
contiguous monomers lying within the said nucleotide
fragment,

- an amino acid is termed analogous to another
amino acid when their respective physicochemical properties,
such as polarity, hydrophobicity and/or basicity
and/or acidity and/or neutrality are substantially the
same; thus, a leucine is analogous to an isoleucine,

- any polypeptide is termed equivalent or
derived from a reference polypeptide if the polypeptides
compared have substantially the same properties, and in
particular the same antigenic, immunological,
enzymological and/or molecular recognition properties; the
following in particular are equivalent to a reference
polypeptide:

a) any polypeptide possessing a sequence in
which at least one amino acid has been replaced by an
analogous amino acid,

b) any polypeptide having an equivalent peptide
sequence, obtained by natural or induced variation of the
said reference polypeptide and/or of the nucleotide
fragment coding for the said polypeptide,

c) a mimotope of the said reference polypeptide,

d) any polypeptide in whose sequence one or more
amino acids of the L series are replaced by an amino acid
of the D series, and vice versa,

e) any polypeptide into whose sequence a modification
of the side chains of the amino acids has been
introduced, such as, for example, an acetylation of the
amine functions, a carboxylation of the thiol functions,
an esterification of the carboxyl functions,

f) any polypeptide in whose sequence one or more
peptide bonds have been modified, such as, for example,
carba, retro, inverse, retro-inverso, reduced and
methyleneoxy bonds,

g) any polypeptide at least one antigen of
which is recognized by an antibody directed against a
reference polypeptide,

- the percentage identity characterizing the
homology of two peptide fragments compared is, according
to the present invention, at least 50% and preferably at
least 70%.
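
As an illustration of the percentage identity used above to
characterize homology, the following is a minimal sketch in
Python. It assumes that the two sequences have already been
aligned to equal length, with "-" marking a gap. The function
names percent_identity and homology_class, the gap convention
and the choice of denominator (every column that is not a gap
in both sequences) are assumptions made for illustration and
are not specified in the description; only the 50% and 70%
thresholds are taken from the text.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical positions between two pre-aligned sequences.

    Assumption: the inputs are already aligned (same length, gaps as `gap`).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Drop columns in which both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable positions")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def homology_class(identity: float) -> str:
    """Map a percentage identity onto the thresholds quoted for peptides."""
    if identity >= 70.0:
        return "preferred (at least 70%)"
    if identity >= 50.0:
        return "homologous (at least 50%)"
    return "below threshold"
```

For example, two five-residue peptides differing at one
position give 80% identity with this convention, which falls
in the preferred range.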

In view of the fact that a virus possessing
reverse transcriptase enzymatic activity may be genetically
characterized equally well in RNA and in DNA form,
both the viral DNA and RNA will be referred to for
characterizing the sequences relating to a virus possessing
such reverse transcriptase activity, termed MSRV-1
according to the present description.
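
Because the same sequence may be referred to in RNA or in DNA
form, it can help to see how the two notations relate. The
short Python sketch below converts between them and derives
the complementary strand; the function names and the toy
sequence used in the usage note are hypothetical and do not
come from the sequence listing.

```python
# Translation table for the four unmodified DNA bases (upper and lower case).
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def rna_to_dna(rna: str) -> str:
    """Write an RNA sequence in DNA notation (U replaced by T)."""
    return rna.replace("U", "T").replace("u", "t")


def dna_to_rna(dna: str) -> str:
    """Write a DNA sequence in RNA notation (T replaced by U)."""
    return dna.replace("T", "U").replace("t", "u")


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, read 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]
```

With the toy input "AUGGCU", rna_to_dna gives "ATGGCT" and
reverse_complement of that result gives "AGCCAT".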

The expressions of order used in the present
description and the claims, such as "first nucleotide
sequence", are not adopted so as to express a particular
order, but so as to define the invention more clearly.

Detection of a substance or agent is understood
below to mean both an identification and a quantification,
or a separation or isolation, of the said substance or
said agent.

A better understanding of the invention will be 
gained on reading the detailed description which follows, 
prepared with reference to the attached figures, in which: 

- Figure 1 shows general consensus sequences of
nucleic acids of the MSRV-1B clones amplified by the PCR
technique in the "pol" region defined by Shih (12), from
viral DNA originating from the LM7PC and PLI-2 lines, and
identified under the references SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5 and SEQ ID NO:6, and the common consensus with
amplification primers bearing the reference SEQ ID NO:7;

- Figure 2 gives the definition of a functional
reading frame for each MSRV-1B/"PCR pol" type family, the
said families A to D being defined, respectively, by the
nucleotide sequences SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5
and SEQ ID NO:6 described in Figure 1;

- Figure 3 gives an example of consensus of the
MSRV-2B sequences, identified by SEQ ID NO:11;

- Figure 4 is a representation of the reverse
transcriptase (RT) activity in dpm (disintegrations per
minute) in the sucrose fractions taken from a purification
gradient of the virions produced by the B lymphocytes in
culture from a patient suffering from MS;

- Figure 5 gives, under the same experimental
conditions as in Figure 4, the assay of the reverse
transcriptase activity in the culture of a B lymphocyte
line obtained from a control free from MS;

- Figure 6 shows the nucleotide sequence of the
clone PSJ17 (SEQ ID NO:9);

- Figure 7 shows the nucleotide sequence SEQ ID
NO:8 of the clone designated M003-P004;

- Figure 8 shows the nucleotide sequence SEQ ID
NO:2 of the clone F11-1; the portion located between the
two arrows in the region of the primer corresponds to a
variability imposed by the choice of primer which was used
for the cloning of F11-1; in this same figure, the
translation into amino acids is shown;

- Figure 9 shows the nucleotide sequence SEQ ID
NO:1, and a possible functional reading frame of SEQ ID
NO:1 in terms of amino acids; on this sequence, the
consensus sequences of the pol gene are underlined;

- Figures 10 and 11 give the results of a PCR,
in the form of a photograph under ultraviolet light of an
ethidium bromide-impregnated agarose gel, of the amplification
products obtained from the primers identified by
SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18 and SEQ ID NO:19;

- Figure 12 gives a representation in matrix
form of the homology between SEQ ID NO:1 of MSRV-1 and
that of an endogenous retrovirus designated HSERV9; this
homology of at least 65% is demonstrated by a continuous
line, the absence of a line meaning a homology of less
than 65%;

- Figure 13 shows the nucleotide sequence SEQ ID
NO:46 of the clone FBd3;

- Figure 14 shows the sequence homology between
the clone FBd3 and the HSERV-9 retrovirus;

- Figure 15 shows the nucleotide sequence SEQ ID
NO:51 of the clone t pol;

- Figures 16 and 17 show, respectively, the
nucleotide sequences SEQ ID NO:52 and SEQ ID NO:53 of the
clones JLBc1 and JLBc2;

- Figure 18 shows the sequence homology between
the clone JLBc1 and the clone FBd3;

- Figure 19 shows the sequence homology between
the clone JLBc2 and the clone FBd3;

- Figure 20 shows the sequence homology between
the clones JLBc1 and JLBc2;

- Figures 21 and 22 show the sequence homology
between the HSERV-9 retrovirus and the clones JLBc1 and
JLBc2, respectively;

- Figure 23 shows the nucleotide sequence SEQ ID
NO:56 of the clone GM3;

- Figure 24 shows the sequence homology between
the HSERV-9 retrovirus and the clone GM3;

- Figure 25 shows the localization of the
different clones studied, relative to the genome of the
known retrovirus ERV9;

- Figure 26 shows the position of the clones
F11-1, M003-P004, MSRV-1B and PSJ17 in the region
hereinafter designated MSRV-1 pol*;

- Figure 27, split into three successive Figures
27a-27c, shows a possible reading frame covering the whole
of the pol gene;

- Figure 28 shows, according to SEQ ID NO:40,
the nucleotide sequence coding for the peptide fragment
POL2B, having the amino acid sequence identified by SEQ ID
NO:39;

- Figure 29 shows the OD values (ELISA tests) at
492 nm obtained for 29 sera of MS patients and 32 sera of
healthy controls tested with an anti-IgG antibody;

- Figure 30 shows the OD values (ELISA tests) at
492 nm obtained for 36 sera of MS patients and 42 sera of
healthy controls tested with an anti-IgM antibody;

- Figures 31 to 33 show the results obtained
(relative intensity of the spots) for 43 overlapping
octapeptides covering the amino acid sequence 61-110,
according to the Spotscan technique, respectively with a
pool of MS sera, with a pool of control sera and with the
pool of MS sera after deduction of a background corresponding
to the maximum signal detected on at least one
octapeptide with the control serum (intensity = 1), on the
understanding that these sera were diluted to 1/50. The
bar at the far right-hand end represents a graphic scale
standard unrelated to the serological test;

- Figure 34 shows the SEQ ID NO:41 and SEQ ID
NO:42 of two polypeptides comprising immunodominant
regions, while SEQ ID NO:43 and 44 represent
immunoreactive polypeptides specific to MS;

- Figure 35 shows the nucleotide sequence SEQ ID
NO:59 of the clone LB19 and three potential reading frames
of SEQ ID NO:59 in terms of amino acids;

- Figure 36 shows the nucleotide sequence SEQ ID
NO:88 (GAG*) and a potential reading frame of SEQ ID NO:88
in terms of amino acids;

- Figure 37 shows the sequence homology between
the clone FBd13 and the HSERV-9 retrovirus; according to
this representation, the continuous line means a
percentage homology greater than or equal to 70% and the
absence of a line means a smaller percentage homology;

- Figure 38 shows the nucleotide sequence SEQ ID
NO:61 of the clone FP6 and three potential reading frames
of SEQ ID NO:61 in terms of amino acids;

- Figure 39 shows the nucleotide sequence SEQ ID
NO:89 of the clone G+E+A and three potential reading
frames of SEQ ID NO:89 in terms of amino acids;

- Figure 40 shows a reading frame found in the
region E and coding for an MSRV-1 retroviral protease
identified by SEQ ID NO:90;

- Figure 41 shows the response of each serum of
patients suffering from MS, indicated by the symbol (+),
and of healthy patients, symbolised by (-), tested with an
anti-IgG antibody, expressed as net optical density at
492 nm;

- Figure 42 shows the response of each serum of
patients suffering from MS, indicated by the symbols (+)
and (QS), and of healthy patients (-), tested with an
anti-IgM antibody, expressed as net optical density at
492 nm;

- Figure 43 shows the RT-activity profile in
sucrose density gradients of pellets from B-cell line
supernatants; Control B-cell line ■ was obtained from the
relative of a patient with mitochondriopathy. MS B-cell
line □ was obtained from a patient with definite MS;

- Figure 44 shows the nucleotide and amino acid
alignment of the conserved pol regions of viruses detected
in the study (cf. Example 18) by the "Pan-retrovirus" PCR.
"Deletions" are represented by dashes and standard
single-letter abbreviations are used to designate amino acids and
nucleotides (i = inosine). The most highly conserved VLPQG
and YXDD regions are shown as separate blocks in bold type
at the end of each sequence. Amino acids which are present
in all or in all but one of the sequences are underlined.
PCR primers (modified from (12)) PAN-UO and PAN-UI are
orientated 5' to 3' (sense) whereas primer PAN-DI is 3' to
5' (antisense). Degeneracies are shown above (PAN-UO &
PAN-DI) or below (PAN-UI) the PCR primer sequences.
"I" denotes the nine base 5' extension attggatcc, "-i"
denotes the nine base 5' extension ctcaagctt. The capture
and detector probes DpV1 and CpV1b used in the ELOSA assay
are shown below a representative MSRV-cpol sequence. At
three positions below the translated MSRV-cpol sequence
alternative amino acids (representing "non-silent" nucleic
acid variations) are shown in italics: K and Y
substitutions were only observed in PLI-1 derived clones
whereas R and W were encoded by a significant proportion
of the clones irrespective of derivation. Note that DpV1
is peroxidase labelled and that CpV1b may be biotinylated
at the 5' end if streptavidin coated plates are used. The
name of each sequence is indicated at the left of the
figure.

HTLV1: Human T-cell Leukaemia Virus type 1; HIV1: Human
Immunodeficiency Virus type 1; MoMLV: Moloney-Murine
Leukaemia Virus; MPMV: Mason-Pfizer Monkey Virus; ERV9:
Endogenous Retrovirus 9; MSRV-cpol: Multiple Sclerosis
associated Retrovirus conserved pol region.

- Figure 45 shows a phylogenetic tree which is
based on the conserved amino acid region encoded by the
pol gene of MSRV and of representative endogenous and
exogenous retroviruses and DNA viruses with reverse
transcriptase. It was generated by the U.P.G.M.A. tree
program of Geneworks® software.

HSRV: Human Spumaretrovirus. EIAV: Equine Infectious
Anaemia Virus. BLV: Bovine Leukaemia Virus. HIV1, HIV2:
Human Immunodeficiency Viruses type 1 and 2. HTLV1 and
HTLV2: Human T-cell Leukaemia Viruses type 1 and 2. F-MuLV:
Friend-Murine Leukaemia Virus. MoMLV: Moloney-Murine
Leukaemia Virus. BAEV: Baboon Endogenous Virus. GaLV:
Gibbon Ape Leukaemia Virus. HUMER41: Human Endogenous
Retroviral sequence, clone 41. IAP: Intracisternal A-type
Particle. MPMV: Mason-Pfizer Monkey Virus. HERVK10: Human
Endogenous Retrovirus K10. MMTV: Mouse Mammary Tumour
Virus. HSERV9 (ERV9 database sequence): Human sequence of
Endogenous Retrovirus 9. MSRV: Multiple Sclerosis
associated Retrovirus. SIV: Simian Immunodeficiency Virus.
RTLV-H: Reverse Transcriptase-Like Viral sequence H. SFV:
Simian Foamy Virus. VISNA: Visna retrovirus. SIV1: Simian
Immunodeficiency Virus type 1. SRV-2: Simian Retrovirus
type 2. SMRV-H: Squirrel Monkey Retrovirus H.

- Figure 46 shows the MSRV sequence in the
Protease and Reverse-Transcriptase regions of the pol
gene.

The amino acid translation is aligned under the
corresponding nucleotide sequence. The region
corresponding to the Protease ORF cloned in a recombinant
vector and expressed in E. coli is boxed. The regions
corresponding to the A and B fragments amplified on plasma
samples from MS patients are indicated by brackets. The
Reverse-Transcriptase (RT) and RNase H (RNH) region is
boxed with a dotted line. The highly conserved amino acids
and/or active sites of enzyme activities of both PRT and
RT (including RNH) are shown underlined.

- Figure 47A illustrates the specific detection
of MSRV-pol RNA sequence by RT-PCR in the sucrose density
fraction associated with RT-activity and in MS plasma;
Figure 47B shows the RT-activity profile on a sucrose
density gradient obtained with extracellular virion
pelleted from an MS choroid-plexus culture. The photograph
below shows an agarose gel loaded with PCR products
amplified from round 1 (ST1.1) RT-PCR products with the
ST1.2 primer set. From left to right: water control 1 from
RT-PCR step with ST1.1 set; water control 2 amplified from
water control 1 with ST1.2 nested primers; molecular
weight markers; fractions No. 1 to 10 corresponding to the
RT-activity profile shown above; plasma samples C1 and C2
from healthy blood donors; plasma samples MS1 and MS2 from
two MS patients.

- Figure 48 shows an example of a variant and/or
recombined sequence in the region of the pol gene defined
by homology with the overlapping regions described in
Figure 25, as GM3, MSRV-1 pol*, t pol and FBd3.

- Figure 49 shows the nucleotide (Figure 49A)
and amino acid (Figure 49B) alignments of the pol region
between clones 1, 5 and 8 of the same patient (Experiment
46-7).

- Figure 50 shows the nucleotide (Figure 50A)
and amino acid (Figure 50B) alignments of the pol region
between clones 41, 43 and 42 of the same patient
(Experiment 68-1).

- Figure 51 shows the nucleotide (Figure 51A)
and amino acid (Figure 51B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 52 shows the nucleotide (Figure 52A)
and amino acid (Figure 52B) alignments of the pol region
between the consensus sequence (SEQ ID NO:169) of clones
41, 43 and 42 of the same patient (Experiment 68-1) and
SEQ ID NO:1, and between their corresponding peptide
sequences.

- Figure 53 shows the nucleotide (Figure 53A)
and amino acid (Figure 53B) alignments of the pol region
between the consensus sequence (SEQ ID NO:176) of clones
1, 5 and 8 of the same patient (Experiment 46-7) and the
consensus sequence (SEQ ID NO:169) of clones 41, 43 and
42 of the same patient (Experiment 68-1).

Table 5 (at the end of the description) shows
the sequences obtained by RT-PCR with degenerate pol
primers on sucrose density gradient fractions containing
the peak of RT-activity or its negative control (cf.
Example 18); and

Table 6 (at the end of the description) shows
the clinical data and results of MSRV-cpol detection by
"Pan-retro" PCR with specific ELOSA assay, on CSF from MS
and control patients (cf. Example 18).
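
The matrix representations of homology mentioned for Figures
12 and 37, in which a line is drawn wherever homology reaches
a threshold (65% and 70% respectively), can be pictured with
the following Python sketch. It is an assumption-laden
illustration, not the software used for the figures: the
function name homology_matrix, the sliding-window approach
and the default window length of 20 are hypothetical choices.

```python
def homology_matrix(seq_a: str, seq_b: str, window: int = 20,
                    threshold: float = 65.0) -> list[tuple[int, int]]:
    """Return (i, j) window start positions whose identity meets the threshold.

    Each hit corresponds to one dot of a dot-matrix plot; consecutive hits
    on a diagonal would appear as a continuous line.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            # Count identical positions in the two ungapped windows.
            same = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if 100.0 * same / window >= threshold:
                hits.append((i, j))
    return hits
```

With a window as long as both toy inputs, a single mismatch in
ten positions (90% identity) yields one hit at the 65%
threshold and none at a 95% threshold.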

EXAMPLE 1: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING, RESPECTIVELY, A RETROVIRUS MSRV-1
AND A COINFECTIVE AGENT MSRV2, BY "NESTED" PCR AMPLIFICATION
OF THE CONSERVED POL REGIONS OF RETROVIRUSES ON
VIRION PREPARATIONS ORIGINATING FROM THE LM7PC AND PLI-2
LINES

A PCR technique derived from the technique
published by Shih (12) was used. This technique enables
all trace of contaminant DNA to be removed by treating all
the components of the reaction medium with DNase. It
concomitantly makes it possible, by the use of different
but overlapping primers in two successive series of PCR
amplification cycles, to increase the chances of amplifying
a cDNA synthesized from an amount of RNA which is
small at the outset and further reduced in the sample by
the spurious action of the DNase on the RNA. In effect,
the DNase is used under conditions of activity in excess
which enable all trace of contaminant DNA to be removed
before inactivation of this enzyme remaining in the sample
by heating to 85°C for 10 minutes. This variant of the PCR
technique described by Shih (12) was used on a cDNA
synthesized from the nucleic acids of fractions of
infective particles purified on a sucrose gradient
according to the technique described by H. Perron (13)
from the "POL-2" isolate (ECACC No. V92072202) produced by
the PLI-2 line (ECACC No. 92072201) on the one hand, and
from the MS7PG isolate (ECACC No. V93010816) produced by
the LM7PC line (ECACC No. 93010817) on the other hand.
These cultures were obtained according to the methods
which formed the subject of the patent applications
published under Nos WO 93/20188 and WO 93/20189.

After cloning the products amplified by this technique with
the TA Cloning Kit® and analysis of the sequence using an
Applied Biosystems model 373A Automatic Sequencer, the
sequences were analysed using the Geneworks® software on the
latest available version of the Genebank® data bank.

The sequences cloned and sequenced from these samples
correspond, in particular, to two types of sequence: a first
type of sequence, to be found in the majority of the clones
(55% of the clones originating from the POL-2 isolates of the
PLI-2 culture, and 67% of the clones originating from the
MS7PG isolates of the LM7PC cultures), which corresponds to a
family of "pol" sequences closely similar to, but different
from, the endogenous human retrovirus designated ERV-9 or
HSERV-9, and a second type of sequence which corresponds to
sequences very strongly homologous to a sequence attributed
to another infective and/or pathogenic agent designated
MSRV-2.
The first type of sequence, representing the majority of the
clones, consists of sequences whose variability enables four
subfamilies of sequences to be defined. These subfamilies are
sufficiently similar to one another for it to be possible to
consider them to be quasi-species originating from the same
retrovirus, as is well known for the HIV-1 retrovirus (14),
or to be the outcome of interference with several endogenous
proviruses coregulated in the producing cells. These more or
less defective endogenous elements are sensitive to the same
regulatory signals possibly generated by a replicative
provirus, since they belong to the same family of endogenous
retroviruses (15). This new family of endogenous
retroviruses, or alternatively this new retroviral species
from which the generation of quasi-species has been obtained
in culture, and which contains a consensus of the sequences
described below, is designated MSRV-1B.

Figure 1 presents the general consensus sequences of the
sequences of the different MSRV-1B clones sequenced in this
experiment, these sequences being identified, respectively,
by SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6.
These sequences display a homology with respect to nucleic
acids ranging from 70% to 88% with the HSERV9 sequence
referenced X57147 and M37638 in the Genebank® data base. Four
"consensus" nucleic acid sequences representative of
different quasi-species of a possibly exogenous retrovirus
MSRV-1B, or of different subfamilies of an endogenous
retrovirus MSRV-1B, have been defined. These representative
consensus sequences are presented in Figure 2, with the
translation into amino acids. A functional reading frame
exists for each subfamily of these MSRV-1B sequences, and it
can be seen that the functional open reading frame
corresponds in each instance to the amino acid sequence
appearing on the second line under the nucleic acid sequence.
The general consensus of the MSRV-1B sequence, identified by
SEQ ID NO: 7 and obtained by this PCR technique in the "pol"
region, is presented in Figure 1.
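
As an illustration of how a nucleic acid homology figure such as the 70% to 88% range quoted above can be computed, the following minimal sketch scores percent identity over two sequences that have already been aligned. The two fragments are made-up placeholders, not MSRV-1B or HSERV-9 data, and the convention of ignoring gapped columns is an assumption; the description does not state which scoring convention the Geneworks® software applied.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are not scored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
query = "ATGGCCTTAGGA-TCA"
reference = "ATGGCATTAGGACTCG"
print(round(percent_identity(query, reference), 1))
```

Other conventions (for example, counting gaps as mismatches) give different percentages for the same alignment, so a quoted homology value is only comparable when the convention is known.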

The second type of sequence representing the majority of the
clones sequenced is represented by the sequence MSRV-2B
presented in Figure 3 and identified by SEQ ID NO: 11. The
differences observed in the sequences corresponding to the
PCR primers are explained by the use of degenerate primers in
mixture form used under different technical conditions.
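
To make the notion of a degenerate primer "in mixture form" concrete, the sketch below expands a primer written with IUPAC ambiguity codes into the individual oligonucleotides it stands for. The example primer is hypothetical and is not one of the primers of this description; inosine (I), which pairs with several bases rather than denoting a mixture, is deliberately not handled.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """List every concrete sequence contained in the degenerate mixture."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical 6-mer, for illustration only.
example = "GAYTRC"
print(degeneracy(example))
print(expand(example))
```

Because each cloned product carries whichever member of the mixture happened to prime it, the primer regions of independent clones can differ even when the internal sequence is the same.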

The MSRV-2B sequence (SEQ ID NO: 11) is sufficiently
divergent from the retroviral sequences already described in
the data banks for it to be suggested that the sequence
region in question belongs to a new infective agent,
designated MSRV-2. This infective agent would, in principle,
on the basis of the analysis of the first sequences obtained,
be related to a retrovirus but, in view of the technique used
for obtaining this sequence, it could also be a DNA virus
whose genome codes for an enzyme which incidentally possesses
reverse transcriptase activity, as is the case, for example,
with the hepatitis B virus, HBV (12). Furthermore, the random
nature of the degenerate primers used for this PCR
amplification technique may very well have permitted, as a
result of unforeseen sequence homologies or of conserved
sites in the gene for a related enzyme, the amplification of
a nucleic acid originating from a prokaryotic or eukaryotic
pathogenic and/or coinfective agent (protist).

EXAMPLE 2: OBTAINING CLONES DESIGNATED MSRV-1B
AND MSRV-2B, DEFINING A FAMILY MSRV-1 AND MSRV-2, BY
"NESTED" PCR AMPLIFICATION OF THE CONSERVED POL REGIONS OF
RETROVIRUSES ON PREPARATIONS OF B LYMPHOCYTES FROM A NEW
CASE OF MS

The same PCR technique, modified according to the technique
of Shih (12), was used to amplify and sequence the RNA
nucleic acid material present in a purified fraction of
virions at the peak of "LM7-like" reverse transcriptase
activity on a sucrose gradient according to the technique
described by H. Perron (13), and according to the protocols
mentioned in Example 1, from a spontaneous lymphoblastoid
line obtained by self-immortalization in culture of B
lymphocytes from an MS patient who was seropositive for the
Epstein-Barr virus (EBV), after setting up the blood lymphoid
cells in culture in a suitable culture medium containing a
suitable concentration of cyclosporin A. A representation of
the reverse transcriptase activity in the sucrose fractions
taken from a purification gradient of the virions produced by
this line is presented in Figure 4. Similarly, the culture
supernatants of a B line obtained under the same conditions
from a control free from MS were treated under the same
conditions, and the assay of reverse transcriptase activity
in the sucrose gradient fractions proved negative throughout
(background), and is presented in Figure 5. Fraction 3 of the
gradient corresponding to the MS B line and the same fraction
without reverse transcriptase activity of the non-MS control
gradient were analysed by the same RT-PCR technique as
before, derived from Shih (12), followed by the same steps of
cloning and sequencing as described in Example 1.

It is particularly noteworthy that the MSRV-1 and MSRV-2 type
sequences are to be found only in the material associated
with a peak of "LM7-like" reverse transcriptase activity
originating from the MS B lymphoblastoid line. These
sequences were not to be found with the material from the
control (non-MS) B lymphoblastoid line in 26 recombinant
clones taken at random. Only Mo-MuLV type contaminant
sequences, originating from the commercial reverse
transcriptase used for the cDNA synthesis step, and sequences
without any particular retroviral analogy were to be found in
this control, as a result of the "consensus" amplification of
homologous polymerase sequences which is produced by this PCR
technique. Furthermore, the absence of a concentrated target
which competes for the amplification reaction in the control
sample permits the amplification of dilute contaminants. The
difference in results is manifestly highly significant
(chi-squared, p<0.001).

EXAMPLE 3: OBTAINING A CLONE PSJ17, DEFINING A
RETROVIRUS MSRV-1, BY REACTION OF ENDOGENOUS REVERSE
TRANSCRIPTASE WITH A VIRION PREPARATION ORIGINATING FROM
THE PLI-2 LINE

This approach is directed towards obtaining
reverse-transcribed DNA sequences from the supposedly
retroviral RNA in the isolate using the reverse transcriptase
activity present in this same isolate. This reverse
transcriptase activity can theoretically function only in the
presence of a retroviral RNA linked to a primer tRNA or
hybridized with short strands of DNA already
reverse-transcribed in the retroviral particles (16). Thus,
the obtaining of specific retroviral sequences in a material
contaminated with cellular nucleic acids was optimized
according to these authors by means of the specific enzymatic
amplification of the portions of viral RNAs with a viral
reverse transcriptase activity. To this end, the authors
determined the particular physicochemical conditions under
which this enzymatic activity of reverse transcription on
RNAs contained in virions could be effective in vitro. These
conditions correspond to the technical description of the
protocols presented below (endogenous RT reaction,
purification, cloning and sequencing).

The molecular approach consisted in using a preparation of
concentrated but unpurified virion obtained from the culture
supernatants of the PLI-2 line, prepared according to the
following method: the culture supernatants are collected
twice weekly, precentrifuged at 10,000 rpm for 30 minutes to
remove cell debris and then frozen at -80°C or used as they
are for the following steps. The fresh or thawed supernatants
are centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
(or 30,000 rpm in a type 45 T LKB-HITACHI rotor) for 2 h at
4°C. After removal of the supernatant, the sedimented pellet
is taken up in a small volume of PBS and constitutes the
fraction of concentrated but unpurified virion. This
concentrated but unpurified viral sample was used to perform
a so-called endogenous reverse transcription reaction, as
described below.
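
The correspondence between a rotor speed and a relative centrifugal force, as in "100,000 g (or 30,000 rpm ...)" above, follows the standard relation RCF = 1.118 x 10^-5 x r x rpm^2, with r the radius in centimetres. The sketch below applies it; the 9.9 cm radius is an assumed, illustrative value and is not taken from this description or from the rotor manufacturer's data.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(target_rcf: float, radius_cm: float) -> float:
    """Rotor speed needed to reach a target RCF at a given radius."""
    return math.sqrt(target_rcf / (RCF_CONSTANT * radius_cm))


# Assumed radius of 9.9 cm (hypothetical value, chosen for illustration).
ASSUMED_RADIUS_CM = 9.9
print(round(rcf(30_000, ASSUMED_RADIUS_CM)))
print(round(rpm_for_rcf(100_000, ASSUMED_RADIUS_CM)))
```

Since the force varies along the tube, a quoted g value depends on whether the minimum, average or maximum radius of the rotor is used.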

A volume of 200 µl of virion purified according to the
protocol described above, and containing a reverse
transcriptase activity of approximately 1-5 million dpm, is
thawed at 37°C until a liquid phase appears, and then placed
on ice. A 5-fold concentrated buffer was prepared with the
following components: 500 mM Tris-HCl pH 8.2; 75 mM NaCl;
25 mM MgCl2; 75 mM DTT and 0.10% NP 40. 100 µl of 5X buffer +
25 µl of a 100 mM solution of dATP + 25 µl of a 100 mM
solution of dTTP + 25 µl of a 100 mM solution of dGTP + 25 µl
of a 100 mM solution of dCTP + 100 µl of sterile distilled
water + 200 µl of the virion suspension (RT activity of
5 million dpm) in PBS were mixed and incubated at 42°C for
3 hours. After this incubation, the
reaction mixture is added directly to a buffered
phenol/chloroform/isoamyl alcohol mixture (Sigma ref.
P 3803); the aqueous phase is collected and one volume of
sterile distilled water is added to the organic phase to
re-extract the residual nucleic acid material. The collected
aqueous phases are combined, and the nucleic acids contained
are precipitated by adding 3M sodium acetate pH 5.2 to 1/10
volume + 2 volumes of ethanol + 1 µl of glycogen
(Boehringer-Mannheim ref. 901 393) and placing the sample at
-20°C for 4 h or overnight at +4°C. The precipitate obtained
after centrifugation is then washed with 70% ethanol and
resuspended in 60 µl of distilled water. The products of this
reaction were then
purified, cloned and sequenced according to the protocol
which will now be described: blunt-ended DNAs with unpaired
adenines at the ends were generated. A "filling-in" reaction
was first performed: 25 µl of the previously purified DNA
solution were mixed with 2 µl of a 2.5 mM solution
containing, in equimolar amounts, dATP + dGTP + dTTP + dCTP /
1 µl of T4 DNA polymerase (Boehringer-Mannheim ref.
1004 786) / 5 µl of 10X "incubation buffer for restriction
enzyme" (Boehringer-Mannheim ref. 1417 975) / 1 µl of a 1%
bovine serum albumin solution / 16 µl of sterile distilled
water. This mixture was incubated for 20 minutes at 11°C.
50 µl of TE buffer and 1 µl of glycogen (Boehringer-Mannheim
ref. 901 393) were added thereto before extraction of the
nucleic acids with phenol/chloroform/isoamyl alcohol (Sigma
ref. P 3803) and precipitation with sodium acetate as
described above. The DNA precipitated after centrifugation is
resuspended in 10 µl of 10 mM Tris buffer pH 7.5. 5 µl of
this suspension were then mixed with 20 µl of 5X Taq buffer,
20 µl of 5 mM dATP, 1 µl (5U) of Taq DNA polymerase
(Amplitaq™) and 54 µl of sterile distilled water. This
mixture is incubated for 2 h at 75°C with a film of oil on
the surface of the solution. The DNA suspended in the aqueous
solution drawn off under the film of oil after incubation is
precipitated as described above and resuspended in 2 µl of
sterile distilled water. The DNA obtained was inserted into a
plasmid using the TA Cloning™ kit. The 2 µl of DNA solution
were mixed with 5 µl of sterile distilled water, 1 µl of a
10-fold concentrated ligation buffer "10X LIGATION BUFFER",
2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA
LIGASE". This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British Biotechnology).
At the end of the procedure, the
white colonies of recombinant bacteria were picked out in
order to be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep" procedure
(17). The plasmid preparation from each recombinant colony
was cut with a suitable restriction enzyme and analysed on
agarose gel. Plasmids possessing an insert detected under UV
light after staining the gel with ethidium bromide were
selected for sequencing of the insert, after hybridization
with a primer complementary to the Sp6 promoter present on
the cloning plasmid of the TA Cloning™ kit. The reaction
prior to sequencing was then performed according to the
method recommended for the use of the sequencing kit "Prism
ready reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic sequencing
was carried out with an Applied Biosystems "Automatic
Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.
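
The composition of the endogenous reverse transcription mixture described above can be checked arithmetically. The sketch below recomputes the final concentrations from the stated volumes and stock concentrations; because only volume ratios matter, the volumes are treated as relative units. The variable and function names are illustrative, and the salts contributed by the PBS of the virion suspension are ignored as a simplifying assumption.

```python
# Volumes of each component of the endogenous RT reaction, in the same
# (relative) unit as given in the description.
volumes = {
    "5X buffer": 100,
    "dATP (100 mM)": 25,
    "dTTP (100 mM)": 25,
    "dGTP (100 mM)": 25,
    "dCTP (100 mM)": 25,
    "sterile distilled water": 100,
    "virion suspension in PBS": 200,
}

# Composition of the 5-fold concentrated buffer (mM, except NP 40 in %).
buffer_5x = {"Tris-HCl": 500, "NaCl": 75, "MgCl2": 25, "DTT": 75, "NP 40 (%)": 0.10}


def final_concentration(stock: float, volume: float, total: float) -> float:
    """Concentration after diluting `volume` of `stock` into `total`."""
    return stock * volume / total


total_volume = sum(volumes.values())
dntp_final_mm = final_concentration(100, volumes["dATP (100 mM)"], total_volume)
buffer_final = {
    name: final_concentration(conc, volumes["5X buffer"], total_volume)
    for name, conc in buffer_5x.items()
}

print(total_volume)    # total reaction volume
print(dntp_final_mm)   # final concentration of each dNTP, in mM
print(buffer_final)    # final buffer composition (1X)
```

On these figures the 5X buffer is diluted exactly 5-fold, which is consistent with the stated volumes forming a single reaction.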

Discriminating analysis on the computerized data banks of the
sequences cloned from the DNA fragments present in the
reaction mixture enabled a retroviral type sequence to be
revealed. The corresponding clone PSJ17 was completely
sequenced, and the sequence obtained, presented in Figure 6
and identified by SEQ ID NO: 9, was analysed using the
"Geneworks®" software on the updated "Genebank™" data banks.
An identical sequence already described could not be found by
analysis of the data banks. Only a partial homology with some
known retroviral elements was to be found. The most useful
relative homology relates to an endogenous retrovirus
designated ERV-9, or HSERV-9, according to the references
(18).

EXAMPLE 4: PCR AMPLIFICATION OF THE NUCLEIC ACID
SEQUENCE CONTAINED BETWEEN THE 5' REGION DEFINED BY THE
CLONE "POL MSRV-1B" AND THE 3' REGION DEFINED BY THE CLONE
PSJ17

Five oligonucleotides, M001, M002-A, M003-BCD, P004 and
P005, were defined in order to amplify the RNA originating
from purified POL-2 virions. Control reactions were performed
so as to check for the presence of contaminants (reaction
with water). The amplification consists of an RT-PCR step
according to the protocol described in Example 2, followed by
a "nested" PCR according to the PCR protocol described in the
document EP-A-0,569,272. In the first RT-PCR cycle, the
primers M001 and P004 or P005 are used. In the second PCR
cycle, the primers M002-A or M003-BCD and the primer P004 are
used. The primers are positioned as follows:

[Diagram of primer positions on the POL-2 RNA: M001, M002-A
and M003-BCD in the region defined by pol MSRV-1B; P004 and
P005 in the region defined by PSJ17.]


Their composition is:
primer M001: GGTGITIGGICAIGG (SEQ ID NO: 20)
primer M002-A: TTAGGGATAGGCGTGATGTCT (SEQ ID NO: 21)
primer M003-BCD: TGAGGGATAGGGGGGATGTAT (SEQ ID NO: 22)
primer P004: AAGGGTTTGGCAGTAGATGAATTT (SEQ ID NO: 23)
primer P005: GGGTAAGGAGTGGTAGAGGTATT (SEQ ID NO: 24)

The "nested" amplification product obtained, and 
designated M003-P004, is presented in Figure 7, and 
corresponds to the sequence SEQ ID NO: 8. 

EXAMPLE 5: AMPLIFICATION AND CLONING OF A
PORTION OF THE MSRV-1 RETROVIRAL GENOME USING A SEQUENCE
ALREADY IDENTIFIED, IN A SAMPLE OF VIRUS PURIFIED AT THE
PEAK OF REVERSE TRANSCRIPTASE ACTIVITY

A PCR technique derived from the technique published by
Frohman (19) was used. The technique derived makes it
possible, using a specific primer at the 3' end of the genome
to be amplified, to elongate the sequence towards the 5'
region of the genome to be analysed. This technical variant
is described in the documentation of the firm "Clontech
Laboratories Inc." (Palo Alto, California, USA) supplied with
its product "5'-AmpliFINDER™ RACE Kit", which was used on a
fraction of virion purified as described above.

The specific 3' primers used in the kit protocol for the
synthesis of the cDNA and the PCR amplification are,
respectively, complementary to the following MSRV-1
sequences:

cDNA: TCATCCATGTACCGAAGG (SEQ ID NO: 25)
amplification: ATGGGGTTCCCAAGTTCCCT (SEQ ID NO: 26)

The products originating from the PCR were obtained after
purification on agarose gel according to conventional methods
(17), and then resuspended in 10 µl of distilled water. Since
one of the properties of Taq polymerase consists in adding an
adenine at the 3' end of each of the two DNA strands, the DNA
obtained was inserted directly into a plasmid using the TA
Cloning™ kit (British Biotechnology). The 2 µl of DNA
solution were mixed with 5 µl of sterile distilled water,
1 µl of a 10-fold concentrated ligation buffer "10X LIGATION
BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA
LIGASE". This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning™ kit (British Biotechnology).
At the end of the procedure, the white colonies of
recombinant bacteria were picked out in order to be cultured
and to permit extraction of the plasmids incorporated
according to the so-called "miniprep" procedure (17). The
plasmid preparation from each recombinant colony was cut with
a suitable restriction enzyme and analysed on agarose gel.
Plasmids possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a primer
complementary to the Sp6 promoter present on the cloning
plasmid of the TA Cloning™ kit. The reaction prior to
sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic sequencing
was carried out with an Applied Biosystems "Automatic
Sequencer model 373 A" apparatus according to the
manufacturer's instructions.

This technique was applied first to two fractions of virion
purified as described below on sucrose from the "POL-2"
isolate produced by the PLI-2 line on the one hand, and from
the MS7PG isolate produced by the LM7PC line on the other
hand. The culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for the
following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g (or
30,000 rpm in a type 45 T LKB-HITACHI rotor) for 2 h at 4°C.
After removal of the supernatant, the sedimented pellet is
taken up in a small volume of PBS and constitutes the
fraction of concentrated but unpurified virions. The
concentrated virus is then applied to a sucrose gradient in
sterile PBS buffer (15 to 50% weight/weight) and
ultracentrifuged at 35,000 rpm (100,000 g) for 12 h at +4°C
in a swing-out rotor. 10 fractions are collected, and 20 µl
are withdrawn from each fraction after homogenization to
assay the reverse transcriptase activity therein according to
the technique described by H. Perron (3). The fractions
containing the peak of "LM7-like" RT activity are then
diluted in sterile PBS buffer and ultracentrifuged for one
hour at 35,000 rpm (100,000 g) to sediment the viral
particles. The pellet of purified virion thereby obtained is
then taken up in a small volume of a buffer which is
appropriate for the extraction of RNA. The cDNA synthesis
reaction mentioned above is carried out on this RNA extracted
from purified extracellular virion. PCR amplification
according to the technique mentioned above enabled the clone
F1-11 to be obtained, whose sequence, identified by SEQ ID
NO: 2, is presented in Figure 8.

This clone makes it possible to define, with the different
clones previously sequenced, a region of considerable length
(1.2 kb) representative of the "pol" gene of the MSRV-1
retrovirus, as presented in Figure 9. This sequence,
designated SEQ ID NO: 1, is reconstituted from different
clones overlapping one another at their ends, correcting the
artefacts associated with the primers and with the
amplification or cloning techniques which would artificially
interrupt the reading frame of the whole. This sequence will
be identified below under the designation "MSRV-1 pol*
region". Its degree of homology with the HSERV-9 sequence is
shown in Figure 12.

In Figure 9, the potential reading frame with its translation
into amino acids is presented below the nucleic acid
sequence.


EXAMPLE 6: DETECTION OF SPECIFIC MSRV-1 AND
MSRV-2 SEQUENCES IN DIFFERENT SAMPLES OF PLASMA
ORIGINATING FROM PATIENTS SUFFERING FROM MS OR FROM
CONTROLS

A PCR technique was used to detect the MSRV-1 and MSRV-2
genomes in plasmas obtained after taking blood samples onto
EDTA from patients suffering from MS and from non-MS
controls.

Extraction of the RNAs from plasma was performed according to
the technique described by P. Chomczynski (20), after adding
one volume of buffer containing guanidinium thiocyanate to
1 ml of plasma stored frozen at -80°C after collection.

For MSRV-2, the PCR was performed under the same conditions
and with the following primers:

- 5' primer, identified by SEQ ID NO: 14
5' GTAGTTCGATGTAGAAAGCG 3';

- 3' primer, identified by SEQ ID NO: 15
5' GCATCCGGCAACTGCACG 3'.

However, similar results were also obtained with the
following PCR primers in two successive amplifications by
"nested" PCR on samples of nucleic acids not treated with
DNase.

The primers used for this first step of 40 cycles with a
hybridization temperature of 48°C are the following:

- 5' primer, identified by SEQ ID NO: 27
5' GCCGATATGAGCCGGCATGG 3', corresponding to a 5' MSRV-2 PCR
primer, for a first PCR on samples from patients,

- 3' primer, identified by SEQ ID NO: 28
5' GCATGCGGCAACTGGACG 3', corresponding to a 3' MSRV-2 PCR
primer, for a first PCR on samples from patients.

After this step, 10 µl of the amplification product are taken
and used to carry out a second, so-called "nested" PCR
amplification with primers located within the region already
amplified. This second step takes place over 35 cycles, with
a primer hybridization ("annealing") temperature of 50°C. The
reaction volume is 100 µl.

The primers used for this second step are the following:

- 5' primer, identified by SEQ ID NO: 29
5' CGCGATGCTGGTTGGAGAGC 3', corresponding to a 5' MSRV-2 PCR
primer, for a nested PCR on samples from patients,

- 3' primer, identified by SEQ ID NO: 30
5' TCTCCACTCCGAATATTCCG 3', corresponding to a 3' MSRV-2 PCR
primer, for a nested PCR on samples from patients.

For MSRV-1, the amplification was performed in two steps.
Furthermore, the nucleic acid sample is treated beforehand
with DNase, and a control PCR without RT (AMV reverse
transcriptase) is performed on the two amplification steps so
as to verify that the RT-PCR amplification comes exclusively
from the MSRV-1 RNA. In the event of a positive control
without RT, the initial aliquot sample of RNA is again
treated with DNase and amplified again.

The protocol for treatment with DNase lacking RNase activity
is as follows: the extracted RNA is aliquoted in the presence
of "RNAse inhibitor" (Boehringer-Mannheim) in water treated
with DEPC at a final concentration of 1 µg in 10 µl; to these
10 µl, 1 µl of "RNAse-free DNAse" (Boehringer-Mannheim) and
1.2 µl of pH 5 buffer containing 0.1 M sodium acetate and
5 mM MgSO4 are added; the mixture is incubated for 15 min at
20°C and brought to 95°C for 1.5 min in a "thermocycler".

The first MSRV-1 RT-PCR step is performed according to a
variant of the RNA amplification method as described in
Patent Application No. EP-A-0,569,272. In particular, the
cDNA synthesis step is performed at 42°C for one hour; the
PCR amplification takes place over 40 cycles, with a primer
hybridization ("annealing") temperature of 53°C. The reaction
volume is 100 µl.

The primers used for this first step are the following:

- 5' primer, identified by SEQ ID NO: 16
5' AGGAGTAAGGAAACCCAACGGAC 3';

- 3' primer, identified by SEQ ID NO: 17
5' TAAGAGTTGCACAAGTGCG 3'.

After this step, 10 µl of the amplification product are taken
and used to carry out a second, so-called "nested" PCR
amplification with primers located within the region already
amplified. This second step takes place over 35 cycles, with
a primer hybridization ("annealing") temperature of 53°C. The
reaction volume is 100 µl.

The primers used for this second step are the following:

- 5' primer, identified by SEQ ID NO: 18
5' TCAGGGATAGCCCCCATCTAT 3';

- 3' primer, identified by SEQ ID NO: 19
5' AACCCTTTGCCACTACATCAATTT 3'.
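
The requirement that the second-step primers be "located within the region already amplified" can be expressed as a simple containment check. The sketch below is illustrative only: the template is a made-up 50-base sequence, the primers are derived from it rather than taken from this description, and exact matching is assumed (no mismatches, degenerate bases or inosine).

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_span(template: str, forward: str, reverse: str):
    """Return (start, end) of the product on the template, or None if no product."""
    start = template.find(forward)
    site = template.find(reverse_complement(reverse))
    if start == -1 or site == -1 or site < start:
        return None
    return start, site + len(reverse)


def is_nested(template: str, outer: tuple, inner: tuple) -> bool:
    """True if the inner primer pair amplifies a region inside the outer product."""
    outer_span = amplicon_span(template, *outer)
    inner_span = amplicon_span(template, *inner)
    if outer_span is None or inner_span is None:
        return False
    return outer_span[0] <= inner_span[0] and inner_span[1] <= outer_span[1]


# Hypothetical template and primers, for illustration only.
template = "AAGCTTGACCGGATCCATGCTTAGGCATCGCCGTAAGCTAGGTACCTTGA"
outer_pair = (template[0:10], reverse_complement(template[40:50]))
inner_pair = (template[10:20], reverse_complement(template[30:40]))

print(amplicon_span(template, *outer_pair))
print(amplicon_span(template, *inner_pair))
print(is_nested(template, outer_pair, inner_pair))
```

A product of the second step is therefore expected only if the first step amplified the intended region, which is what gives nested PCR its added specificity.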

Figures 10 and 11 present the results of PCR in the form of
photographs under ultraviolet light of ethidium
bromide-impregnated agarose gels, in which an electrophoresis
of the PCR amplification products applied separately to the
different wells was performed.

The top photograph (Figure 10) shows the result of specific
MSRV-2 amplification.

Well number 8 contains a mixture of DNA molecular weight
markers, and wells 1 to 7 represent, in order, the products
amplified from the total RNAs of plasmas originating from 4
healthy controls free from MS (wells 1 to 4) and from 3
patients suffering from MS at different stages of the disease
(wells 5 to 7).

In this series, MSRV-2 nucleic acid material is detected in
the plasma of one case of MS out of the 3 tested, and in none
of the 4 control plasmas. Other results obtained on more
extensive series confirm these results.

The bottom photograph (Figure 11) shows the result of
specific amplification by MSRV-1 "nested" RT-PCR:

well No. 1 contains the PCR product produced with water
alone, without the addition of AMV reverse transcriptase;
well No. 2 contains the PCR product produced with water
alone, with the addition of AMV reverse transcriptase; well
No. 3 contains a mixture of DNA molecular weight markers;
wells 4 to 13 contain, in order, the products amplified from
the total RNAs extracted from sucrose gradient fractions
(collected in a downward direction), on which gradient a
pellet of virion originating from a supernatant of a culture
infected with MSRV-1 and MSRV-2 was centrifuged to
equilibrium according to the protocol described by H. Perron
(13); to well 14 nothing was applied; to wells 15 to 17, the
amplified products of RNA extracted from plasmas originating
from 3 different patients suffering from MS at different
stages of the disease were applied.

The MSRV-1 retroviral genome is indeed to be found in the
sucrose gradient fraction containing the peak of reverse
transcriptase activity measured according to the technique
described by H. Perron (3), with a very strong intensity
(fraction 5 of the gradient, placed in well No. 8). A slight
amplification has taken place in the first fraction (well
No. 4), probably corresponding to RNA released by lysed
particles which floated at the surface of the gradient;
similarly, aggregated debris has sedimented in the last
fraction (tube bottom), carrying with it a few copies of the
MSRV-1 genome which have given rise to an amplification of
low intensity.

Of the 3 MS plasmas tested in this series, MSRV-1 RNA turned
up in one case, producing a very intense amplification (well
No. 17).
In this series, the MSRV-1 retroviral RNA genome, probably
corresponding to particles of extracellular virus present in
the plasma in extremely small numbers, was detected by
"nested" RT-PCR in one case of MS out of the 3 tested. Other
results obtained on more extensive series confirm these
results.

Furthermore, the specificity of the sequences amplified by
these PCR techniques may be verified and evaluated by the
"ELOSA" technique as described by F. Mallet (21) and in the
document FR-A-2,663,040.

For MSRV-1, the products of the nested PCR described above
may be tested in two ELOSA systems enabling a consensus A and
a consensus B+C+D of MSRV-1 to be detected separately,
corresponding to the subfamilies described in Example 1 and
Figures 1 and 2. In effect, the sequences closely resembling
the consensus B+C+D are to be found essentially in the RNA
samples originating from MSRV-1 virions purified from
cultures or amplified in extracellular biological fluids of
MS patients, whereas the sequences closely resembling the
consensus A are essentially to be found in normal human
cellular DNA.

The ELOSA/MSRV-1 system for the capture and specific
hybridization of the PCR products of the subfamily A uses a
capture oligonucleotide cpV1A with an amine bond at the 5'
end and a biotinylated detection oligonucleotide dpV1A having
as their sequence, respectively:

- cpV1A identified by SEQ ID NO: 31
5' GATGTAGGGGACTTGTGAGGTGCAGS 3', corresponding to the ELOSA
capture oligonucleotide for the products of MSRV-1 nested PCR
performed with the primers identified by SEQ ID NO: 16 and
SEQ ID NO: 17, optionally followed by amplification with the
primers identified by SEQ ID NO: 18 and SEQ ID NO: 19 on
samples from patients;

- dpV1A identified by SEQ ID NO: 32;
5' CATCTITTTGGICAGGCAITAGC 3', corresponding to the ELOSA
detection oligonucleotide for the subfamily A of the products
of MSRV-1 "nested" PCR performed with the primers identified
by SEQ ID NO: 16 and SEQ ID NO: 17, optionally followed by
amplification with the primers identified by SEQ ID NO: 18
and SEQ ID NO: 19 on samples from patients.

The ELOSA/MSRV-1 system for the capture and specific
hybridization of the PCR products of the subfamily B+C+D uses
the same biotinylated detection oligonucleotide dpV1A and a
capture oligonucleotide cpV1B with an amine bond at the 5'
end having as its sequence:

- cpV1B identified by SEQ ID NO: 33
5' CTTGAGCCAGTTCTCATACCTGGA 3', corresponding to the ELOSA
capture oligonucleotide for the subfamily B+C+D of the
products of MSRV-1 "nested" PCR performed with the primers
identified by SEQ ID NO: 16 and SEQ ID NO: 17, optionally
followed by amplification with the primers identified by SEQ
ID NO: 18 and SEQ ID NO: 19 on samples from patients.

This ELOSA detection system made it possible to
verify that none of the PCR products thus amplified from
DNase-treated plasmas of MS patients contained a sequence
of the subfamily A, and that all were positive with the
consensus of the subfamilies B, C and D.

For MSRV-2, a similar ELOSA technique was evaluated
on isolates originating from infected cell cultures,
using the following PCR amplification primers:

- 5' primer, identified by SEQ ID NO: 34

5' AGTGYTRGGMGARGGGGGTGAA 3', corresponding to a
5' MSRV-2 PCR primer, for PCR on samples from cultures,

- 3' primer, identified by SEQ ID NO: 35

5' GMGGGGAGCAGSAKGTGATGGA 3', corresponding to a
3' MSRV-2 PCR primer, for PCR on samples from cultures,

and the capture oligonucleotide cpV2, with an amine
bond at the 5' end, and the biotinylated detection
oligonucleotide dpV2, having as their respective sequences:

- cpV2 identified by SEQ ID NO: 36

5' GGATGCCGCCTATAGCCTCTAC 3', corresponding to an
ELOSA capture oligonucleotide for the products of MSRV-2
PCR performed with the primers SEQ ID NO: 34 and SEQ ID
NO: 35, or optionally with the degenerate primers defined
by Shih (12),

- dpV2 identified by SEQ ID NO: 37

5' AAGCCTATCGCGTGCAGTTGCC 3', corresponding to
an ELOSA detection oligonucleotide for the products of
MSRV-2 PCR performed with the primers SEQ ID NO: 34 and SEQ
ID NO: 35, or optionally with the degenerate primers
defined by Shih (12).
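The MSRV-2 primers above contain degenerate positions written with
IUPAC ambiguity codes (Y, R, M, S, K). As a minimal sketch, assuming
the standard IUPAC code table, the following Python shows how the
number of distinct sequences in such a primer pool can be counted and
listed. The function names and the 5-mer used in the demonstration
are illustrative only and do not come from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """List every non-degenerate sequence encoded by the primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    toy = "AGYTR"  # hypothetical 5-mer, not a sequence from the text
    print(degeneracy(toy), expand(toy))
```

Inosine (written I in some oligonucleotides) is not an IUPAC
ambiguity code and is deliberately not handled by this sketch.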

This PCR amplification system, with a pair of
primers different from those which were described previously
for amplification on the samples from patients, made
it possible to confirm the infection with MSRV-2 of in
vitro cultures and of samples of nucleic acids used for
the molecular biology studies.

All things considered, the first results of PCR
detection of the genome of pathogenic and/or infective
agents show that it is possible that free "virus" may
circulate in the blood stream of patients in an acute,
virulent phase, outside the nervous system. This is
compatible with the almost invariable presence of "gaps"
in the blood-brain barrier of patients in an active phase
of MS.

EXAMPLE 7: OBTAINING SEQUENCES OF THE "env" GENE 
OF THE MSRV-1 RETROVIRAL GENOME 

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The technique derived makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of Clontech
Laboratories Inc. (Palo Alto, California, USA) supplied
with its product "5'-AmpliFINDER™ RACE Kit", which was
used on a fraction of virion purified as described above.

In order to carry out an amplification of the 3'
region of the MSRV-1 retroviral genome encompassing the
region of the "env" gene, a study was carried out to
determine a consensus sequence in the LTR regions of the
same type as those of the defective endogenous retrovirus
HSERV-9 (18, 24), with which the MSRV-1 retrovirus
displays partial homologies.

The same specific 3' primer was used in the kit
protocol for the synthesis of the cDNA and the PCR
amplification; its sequence is as follows:

GTGCTGATTGGTGTATTTACAATCC (SEQ ID NO: 45)

Synthesis of the complementary DNA (cDNA) and
unidirectional PCR amplification with the above primer
were carried out in one step according to the method
described in Patent EP-A-0,569,272.

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). 2 µl of the DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold
concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out in order to
be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analysed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the
insert, after hybridization with a primer complementary to
the Sp6 promoter present on the cloning plasmid of the TA
Cloning™ kit. The reaction prior to sequencing was then
performed according to the method recommended for the use
of the sequencing kit "Prism ready reaction kit dye
deoxyterminator cycle sequencing kit" (Applied Biosystems,
ref. 401384), and automatic sequencing was carried out
with an Applied Biosystems "automatic sequencer, model
373 A" apparatus according to the manufacturer's
instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique
mentioned above enabled the clone FBd3 to be obtained,
whose sequence, identified by SEQ ID NO: 46, is presented
in Figure 13.

In Figure 14, the sequence homology between the
clone FBd3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line for any partial homology
greater than or equal to 65%. It can be seen that there
are homologies in the flanking regions of the clone (with
the pol gene at the 5' end and with the env gene and then
the LTR at the 3' end), but that the internal region is
totally divergent and does not display any homology, even
weak, with the "env" gene of HSERV-9. Furthermore, it is
apparent that the clone FBd3 contains a longer "env"
region than the one which is described for the defective
endogenous HSERV-9; it may thus be seen that the internal
divergent region constitutes an "insert" between the
regions of partial homology with the HSERV-9 defective
genes.
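A matrix chart of this kind can be approximated by a sliding-window
comparison. The text does not state the window length or the software
used, so the sketch below is only an assumption-laden illustration:
it compares ungapped windows of a fixed, arbitrarily chosen length
and records the positions where identity reaches the 65% level
mentioned above. The names `window_identity`, `dot_matrix` and the
default `window=20` are hypothetical.

```python
def window_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length strings."""
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def dot_matrix(seq1: str, seq2: str, window: int = 20,
               threshold: float = 65.0) -> list:
    """Return (i, j) window start positions with identity >= threshold.

    Consecutive hits along a diagonal (i+1, j+1, ...) would appear as a
    continuous line when plotted, as in a homology matrix chart.
    """
    hits = []
    for i in range(len(seq1) - window + 1):
        for j in range(len(seq2) - window + 1):
            identity = window_identity(seq1[i:i + window],
                                       seq2[j:j + window])
            if identity >= threshold:
                hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Toy sequences, not taken from the text.
    print(dot_matrix("ACGTACGTAC", "ACGTTCGTAC", window=5))
```

A region with no hits along any diagonal corresponds to the
"totally divergent" internal region described for the clone.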

EXAMPLE 8: AMPLIFICATION, CLONING AND SEQUENCING
OF THE REGION OF THE MSRV-1 RETROVIRAL GENOME LOCATED
BETWEEN THE CLONES PSJ17 AND FBd3

Four oligonucleotides, F1, B4, F6 and B1, were
defined for amplifying RNA originating from concentrated
virions of the strains POL2 and MS7PG. Control reactions
were performed so as to check for the presence of
contaminants (reaction with water). The amplification
consists of a first step of RT-PCR according to the
protocol described in Patent Application EP-A-0,569,272,
followed by a second step of PCR performed on 10 µl of
product of the first step with primers internal to the
amplified first region ("nested" PCR). In the first RT-PCR
cycle, the primers F1 and B4 are used. In the second PCR
cycle, the primers F6 and B1 are used. The
primers are positioned as follows:

        F1    F6                      B1    B4

   MSRV-1 RNA

   PSJ17 >                                < FBd3

   5' pol MSRV-1                  3' pol MSRV-1
                                  5' env

Their composition is:

primer F1: TGATGTGAACGGCATACTCACTG (SEQ ID NO: 47)
primer B4: CCCAGAGGTTAGGAACTCCCTTTC (SEQ ID NO: 48)
primer F6: GCTAAAGGAGACTTGTGGTTGTCAG (SEQ ID NO: 49)
primer B1: CAACATGGGCATTTCGGATTAG (SEQ ID NO: 50)

The product of "nested" amplification obtained
and designated "t pol" is presented in Figure 15, and
corresponds to the sequence SEQ ID NO: 51.

EXAMPLE 9: OBTAINING NEW SEQUENCES, EXPRESSED AS 
RNA IN CELLS IN CULTURE PRODUCING MSRV-1, AND COMPRISING 
AN "env" REGION OF THE MSRV-1 RETROVIRAL GENOME 

A library of cDNA was produced according to the
procedure described by the manufacturer of the "cDNA
synthesis module, cDNA rapid adaptor ligation module,
cDNA rapid cloning module and lambda gt10 in vitro
packaging module" kits (Amersham, ref RPN1256Y/Z, RPN1712,
RPN1713, RPN1717, N334Z), from the messenger RNA extracted
from cells of a B lymphoblastoid line such as is described
in Example 2, established from the lymphocytes of a
patient suffering from MS and possessing reverse
transcriptase activity which is detectable according to
the technique described by Perron et al. (3).

Oligonucleotides were defined for amplifying the
cDNA cloned into the nucleic acid library between the 3'
region of the clone PSJ17 (pol) and the 5' (LTR) region of
the clone FBd3. Control reactions were performed so as to
check for the presence of contaminants (reaction with
water). PCR reactions performed on the nucleic acids
cloned into the library with different pairs of primers
enabled a series of clones linking pol sequences to the
MSRV-1 type env or LTR sequences to be amplified.
Two clones are representative of the sequences
obtained in the cellular cDNA library:

- the clone JLBc1, whose sequence SEQ ID NO: 52 is
presented in Figure 16;

- the clone JLBc2, whose sequence SEQ ID NO: 53 is
presented in Figure 17.

The sequences of the clones JLBc1 and JLBc2 are
homologous to that of the clone FBd3, as is apparent in
Figures 18 and 19. The homology between the clone JLBc1
and the clone JLBc2 is shown in Figure 20.

The homologies between the clones JLBc1 and
JLBc2 on the one hand and the HSERV-9 sequence on the other
hand are presented, respectively, in Figures 21 and 22.

It will be noted that the region of homology
between JLBc1, JLBc2 and FBd3 comprises, with a few sequence
and size variations of the "insert", the additional
sequence absent ("inserted") in the HSERV-9 env sequence,
as described in Example 8.

It will also be noted that the cloned "pol"
region is very homologous to HSERV-9, does not possess a
reading frame (bearing in mind the sequence errors induced
by the techniques used, including even the automatic
sequencer) and diverges from the MSRV-1 sequences obtained
from virions. In view of the fact that these sequences
were cloned from the RNA of cells expressing MSRV-1
particles, it is probable that they originate from
endogenous retroviral elements related to the ERV9 family;
this is all the more likely since the pol and
env genes are present on the same RNA, which is clearly not
the MSRV-1 genomic RNA. Some of these ERV9 elements
possess functional LTRs which can be activated by
replicative viruses coding for homologous or heterologous
transactivators. Under these conditions, the relationship
between MSRV-1 and HSERV-9 makes probable the
transactivation of the defective (or otherwise) endogenous
ERV9 elements by homologous, or even identical, MSRV-1
transactivating proteins.

Such a phenomenon may induce a viral interference
between the expression of MSRV-1 and the related
endogenous elements. Such an interference generally leads
to a so-called "defective-interfering" expression, some
features of which were to be found in the MSRV-1-infected
cultures studied. Furthermore, such a phenomenon does not
fail to generate the expression of polypeptides, or even
of endogenous retroviral proteins, which are not
necessarily tolerated by the immune system. Such a scheme
of aberrant expression of endogenous elements related to
MSRV-1 and induced by the latter is liable to multiply the
aberrant antigens, and hence to contribute to the
induction of autoimmune processes such as are observed in
MS.

It is, however, essential to note that the
clones JLBc1 and JLBc2 differ from the ERV9 or HSERV-9
sequence already described, in that they possess a longer
env region comprising an additional region totally
divergent from ERV9. Their kinship with the endogenous
ERV9 family may hence be defined, but they clearly
constitute novel elements never hitherto described. In
effect, interrogation of the data banks of nucleic acid
sequences available in version No. 15 (1995) of the
"Entrez" software (NCBI, NIH, Bethesda, USA) did not
enable a known homologous sequence in the env region of
these clones to be identified.

EXAMPLE 10: OBTAINING SEQUENCES LOCATED IN THE
5' pol AND 3' gag REGION OF THE MSRV-1 RETROVIRAL GENOME

As has already been described in Example 5, a
PCR technique derived from the technique published by
Frohman (19) was used. The technique derived makes it
possible, using a specific primer at the 3' end of the
genome to be amplified, to elongate the sequence towards
the 5' region of the genome to be analysed. This technical
variant is described in the documentation of the firm
Clontech Laboratories Inc. (Palo Alto, California, USA)
supplied with its product "5'-AmpliFINDER™ RACE Kit",
which was used on a fraction of virion purified as
described above.

In order to carry out an amplification of the 5'
region of the MSRV-1 retroviral genome starting from the
pol sequence already sequenced (clone F11-1) and extending
towards the gag gene, MSRV-1 specific primers were
defined.

The specific 3' primers used in the kit protocol
for the synthesis of the cDNA and the PCR amplification
are, respectively, complementary to the following MSRV-1
sequences:

cDNA: (SEQ ID NO: 54)
CCTGAGTTCTTGCACTAACCC

amplification: (SEQ ID NO: 55)
GTCCGTTGGGTTTCCTTACTCCT
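Because the primers are stated to be complementary to the listed
genomic sequences, the oligonucleotide actually synthesized would be
the reverse complement of each listed sequence, read 5' to 3'. A
minimal Python sketch of that conversion follows; it assumes that
"complementary" is meant in the usual antiparallel sense, and the
function name is illustrative.

```python
# Translation table for Watson-Crick complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Sequences as printed in the text for SEQ ID NO: 54 and 55.
    for name, target in [("cDNA primer", "CCTGAGTTCTTGCACTAACCC"),
                         ("PCR primer", "GTCCGTTGGGTTTCCTTACTCCT")]:
        print(name, reverse_complement(target))
```

Applying the function twice returns the original sequence, which is
a quick sanity check when transcribing primers by hand.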

The products originating from the PCR were
extracted after purification on agarose gel according to
conventional methods (17), and then resuspended in 10 µl
of distilled water. Since one of the properties of Taq
polymerase consists in adding an adenine at the 3' end of
each of the two DNA strands, the DNA obtained was inserted
directly into a plasmid using the TA Cloning™ kit (British
Biotechnology). 2 µl of the DNA solution were mixed with
5 µl of sterile distilled water, 1 µl of a 10-fold






concentrated ligation buffer "10X LIGATION BUFFER", 2 µl
of "pCR™ VECTOR" (25 ng/µl) and 1 µl of "TA DNA LIGASE".
This mixture was incubated overnight at 12°C. The
following steps were carried out according to the
instructions of the TA Cloning® kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analysed on agarose gel. Plasmids
possessing an insert detected under UV light after
staining the gel with ethidium bromide were selected for
sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning™ kit. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"automatic sequencer, model 373 A" apparatus according to
the manufacturer's instructions.

This technical approach was applied to a sample
of virion concentrated as described below from a mixture
of culture supernatants produced by B lymphoblastoid lines
such as are described in Example 2, established from
lymphocytes of patients suffering from MS and possessing
reverse transcriptase activity which is detectable
according to the technique described by Perron et al. (3):
the culture supernatants are collected twice weekly,
precentrifuged at 10,000 rpm for 30 minutes to remove cell
debris and then frozen at -80°C or used as they are for
the following steps. The fresh or thawed supernatants are
centrifuged on a cushion of 30% glycerol-PBS at 100,000 g
for 2 h at 4°C. After removal of the supernatant, the
sedimented pellet constitutes the sample of concentrated
but unpurified virions. The pellet thereby obtained is
then taken up in a small volume of an appropriate buffer
for the extraction of RNA. The cDNA synthesis reaction
mentioned above is carried out on this RNA extracted from
concentrated extracellular virion.

RT-PCR amplification according to the technique
mentioned above enabled the clone GM3 to be obtained,
whose sequence, identified by SEQ ID NO: 56, is presented
in Figure 23.

In Figure 24, the sequence homology between the
clone GM3 and the HSERV-9 retrovirus is shown on the
matrix chart by a continuous line, for any partial
homology greater than or equal to 65%.

In summary, Figure 25 shows the localization of
the different clones studied above, relative to the known
ERV9 genome. In Figure 25, since the MSRV-1 env region is
longer than the reference ERV9 env gene, the additional
region is shown above the point of insertion according to
a "V", on the understanding that the inserted material
displays a sequence and size variability between the
clones shown (JLBc1, JLBc2, FBd3). Figure 26 shows the
position of the different clones studied in the MSRV-1 pol
region.

By means of the clone GM3 described above, a
possible reading frame could be defined, covering the
whole of the pol gene, referenced according to SEQ ID
NO: 57, shown in the successive Figures 27a to 27c.

EXAMPLE 11: DETECTION OF ANTI-MSRV-1 SPECIFIC 
ANTIBODIES IN HUMAN SERUM 

Identification of the sequence of the pol gene
of the MSRV-1 retrovirus and of an open reading frame of
this gene enabled the amino acid sequence SEQ ID NO: 39 of
a region of the said gene, referenced SEQ ID NO: 40, to be
determined (see Figure 28).

Different synthetic peptides corresponding to
fragments of the protein sequence of MSRV-1 reverse
transcriptase encoded by the pol gene were tested for
their antigenic specificity with respect to sera of
patients suffering from MS and of healthy controls.

The peptides were synthesized chemically by
solid-phase synthesis according to the Merrifield technique
(Barany G. and Merrifield R.B., 1980, in The
Peptides, 2, 1-284, Gross E. and Meienhofer J., Eds.,
Academic Press, New York). The practical details are those
described below.

a) Peptide synthesis: 

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene resin
(Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a
double coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cleaved from the resin, and
the side-chain protective groups removed simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of
anisole and 1 ml of dimethyl sulphide (DMS) are used. The
mixture is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high
performance liquid chromatography on a VYDAC® C18 type
column (250 x 21 mm) (The Separation Group, Hesperia, CA,
USA). Elution is carried out with an acetonitrile gradient
at a flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC® C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the
object of monitoring their amino acid composition, using
an Applied Biosystems 420H automatic amino acid analyser.
Measurement of the (average) chemical molecular mass of
the peptides is obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB.ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was
tested against sera of patients suffering from MS and
against sera of healthy controls. This enabled a peptide
designated POL2B to be selected, whose sequence is shown
in Figure 28 under the identifier SEQ ID NO: 39,
encoded by the pol gene of MSRV-1 (nucleotides 181 to
330).

b) Antigenic properties: 

The antigenic properties of the POL2B peptide
were demonstrated according to the ELISA protocol
described below.

The lyophilized POL2B peptide was dissolved in
sterile distilled water at a concentration of 1 mg/ml.
This stock solution was aliquoted and kept at +4°C for use
over a fortnight, or frozen at -20°C for use within 2
months. An aliquot is diluted in PBS (phosphate buffered
saline) solution so as to obtain a final peptide
concentration of 1 microgram/ml. 100 microlitres of this
dilution are placed in each well of microtitration plates
("high-binding" plastic, COSTAR ref: 3590). The plates are
covered with a "plate-sealer" type adhesive and kept
overnight at +4°C for the phase of adsorption of the
peptide to the plastic. The adhesive is removed and the
plates are washed three times with a volume of 300
microlitres of a solution A (1X PBS, 0.05% Tween 20®), then
inverted over an absorbent tissue. The plates thus drained
are filled with 200 microlitres per well of a solution B
(solution A + 10% of goat serum), then covered with an
adhesive and incubated for 45 minutes to 1 hour at 37°C.
The plates are then washed three times with the solution A
as described above.
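The coating step involves a simple dilution: a 1 mg/ml stock brought
to 1 microgram/ml and dispensed at 100 microlitres per well. The
arithmetic can be sketched as follows. The function names and the
96-well plate in the demonstration are assumptions introduced for
illustration; the text specifies only the concentrations and the
volume per well.

```python
def dilution_factor(stock_ug_per_ml: float, final_ug_per_ml: float) -> float:
    """Fold dilution needed to go from stock to working concentration."""
    return stock_ug_per_ml / final_ug_per_ml


def stock_volume_ul(final_volume_ul: float, stock_ug_per_ml: float,
                    final_ug_per_ml: float) -> float:
    """Volume of stock (in microlitres) needed, from C1 * V1 = C2 * V2."""
    return final_volume_ul * final_ug_per_ml / stock_ug_per_ml


def peptide_per_well_ng(final_ug_per_ml: float, well_volume_ul: float) -> float:
    """Mass of peptide per well in ng (1 microgram/ml equals 1 ng/microlitre)."""
    return final_ug_per_ml * well_volume_ul


if __name__ == "__main__":
    stock = 1000.0   # 1 mg/ml expressed in micrograms/ml (from the text)
    working = 1.0    # 1 microgram/ml (from the text)
    per_well = 100.0 # microlitres per well (from the text)
    wells = 96       # hypothetical plate size, not stated in the text
    print(dilution_factor(stock, working))
    print(stock_volume_ul(per_well * wells, stock, working))
    print(peptide_per_well_ng(working, per_well))
```

Under these figures each well receives 100 ng of peptide, and the
stock is diluted 1000-fold.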

The test serum samples are diluted beforehand to
1/50 in the solution B, and 100 microlitres of each diluted
test serum are placed in the wells of each microtitration
plate. A negative control is placed in one well of each
plate, in the form of 100 microlitres of buffer B. The
plates covered with an adhesive are then incubated for 1
to 3 hours at 37°C. The plates are then washed three times
with the solution A as described above. In parallel, a
peroxidase-labelled goat antibody directed against human
IgG (Sigma Immunochemicals ref. A6029) or IgM (Cappel ref.
55228) is diluted in the solution B (dilution 1/5000 for
the anti-IgG and 1/1000 for the anti-IgM). 100 microlitres
of the appropriate dilution of the labelled antibody are
then placed in each well of the microtitration plates, and
the plates covered with an adhesive are incubated for 1 to
2 hours at 37°C. A further washing of the plates is then
performed as described above. In parallel, the peroxidase
substrate is prepared according to the directions of the
"Sigma fast OPD kit" (Sigma Immunochemicals, ref. P9187).
100 microlitres of substrate solution are placed in each
well, and the plates are placed protected from light for
20 to 30 minutes at room temperature.

When the colour reaction has stabilized, the
plates are placed immediately in an ELISA plate
spectrophotometric reader, and the optical density (OD) of
each well is read at a wavelength of 492 nm. Alternatively,
30 microlitres of 1N HCl are placed in each well
to stop the reaction, and the plates are read in the
spectrophotometer within 24 hours.

The serological samples are introduced in duplicate
or in triplicate, and the optical density (OD)
corresponding to the serum tested is calculated by taking
the mean of the OD values obtained for the same sample at
the same dilution.

The net OD of each serum corresponds to the mean
OD of the serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
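The net OD calculation described above reduces to a difference of
two means. A minimal Python sketch, with an illustrative function
name and made-up replicate readings, could look like this:

```python
from statistics import mean


def net_od(sample_replicates, negative_control_replicates) -> float:
    """Mean OD of the serum replicates minus mean OD of the negative control."""
    return mean(sample_replicates) - mean(negative_control_replicates)


if __name__ == "__main__":
    # Hypothetical duplicate readings at 492 nm, not data from the text.
    serum = [0.70, 0.72]
    blank = [0.05, 0.07]
    print(round(net_od(serum, blank), 3))
```

The same function accepts duplicates or triplicates, since it only
averages whatever replicate readings are supplied.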

c) Detection of anti-MSRV-1 IgG antibodies by 

ELISA: 

The technique described above was used with the
POL2B peptide to test for the presence of anti-MSRV-1
specific IgG antibodies in the serum of 29 patients for
whom a definite or probable diagnosis of MS was established
according to the criteria of Poser (23), and of 32
healthy controls (blood donors).

Figure 29 shows the results for each serum
tested with an anti-IgG antibody. Each vertical bar
represents the net optical density (OD at 492 nm) of a
serum tested. The ordinate axis gives the net OD at the
top of the vertical bars. The first 29 vertical bars lying
to the left of the vertical broken line represent the sera
of 29 cases of MS tested, and the 32 vertical bars lying
to the right of the vertical broken line represent the
sera of 32 healthy controls (blood donors).

The mean of the net OD values for the MS sera
tested is 0.62. The diagram enables 5 controls to be
revealed whose net OD rises above the grouped values of
the control population. These values may represent the
presence of specific IgGs in symptomless seropositive
patients. Two methods were hence evaluated in order to
determine the statistical threshold of positivity of the
test.

The mean of the net OD values for the controls,
including the controls with high net OD values, is 0.36.
Without the 5 controls whose net OD values are greater
than or equal to 0.5, the mean of the "negative" controls
is 0.33. The standard deviation of the negative controls
is 0.10. A theoretical threshold of positivity may be
calculated according to the formula:

threshold value = (mean of the net OD values of the
seronegative controls) + (2 or 3 x standard deviation of
the net OD values of the seronegative controls).

In the first case, symptomless seropositives are
considered to exist among the controls, and the threshold
value is equal to 0.33 + (2 x 0.10) = 0.53. The negative
results then represent a non-specific "background" relative
to the presence of antibodies directed specifically against
an epitope of the peptide.

In the second case, if the set of controls
consisting of blood donors in apparent good health is
taken as a reference basis, without excluding the sera
which are, on the face of it, seropositive, the standard
deviation of the "non-MS controls" is 0.116. The threshold
value then becomes 0.36 + (2 x 0.116) = 0.59.
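The two threshold calculations can be reproduced with a few lines of
Python. This is a sketch only: the function names are illustrative,
and the text does not say whether the standard deviation is the
sample or the population estimate, so `threshold_from_controls`
assumes the sample standard deviation.

```python
from statistics import mean, stdev


def positivity_threshold(control_mean: float, control_sd: float,
                         k: float = 2) -> float:
    """Threshold = mean of control net ODs + k * standard deviation."""
    return control_mean + k * control_sd


def threshold_from_controls(net_ods, k: float = 2) -> float:
    """Same threshold computed directly from a list of control net ODs.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    return positivity_threshold(mean(net_ods), stdev(net_ods), k)


if __name__ == "__main__":
    # Summary figures quoted in the text for the two methods.
    print(round(positivity_threshold(0.33, 0.10), 2))   # first method
    print(round(positivity_threshold(0.36, 0.116), 2))  # second method
```

With the quoted means and standard deviations, the two methods give
0.53 and 0.592, the latter being reported as 0.59 in the text.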

According to this analysis, the test is specific
for MS since, as shown in Table 1, no control
has a net OD above this threshold. In fact, this result
reflects the fact that the antibody titres in patients
suffering from MS are, for the most part, higher than in
healthy controls who have been in contact with MSRV-1.

TABLE No. 1



                   MS        CONTROLS

                   0.681     0.3515
                   1.0425    0.56
                   0.5675    0.3565
                   0.63      0.449
                   0.588     0.2825
                   0.645     0.55
                   0.6635    0.52
                   0.576     0.2535
                   0.7765    0.55
                   0.5745    0.51
                   0.513     0.426
                   0.4325    0.451
                   0.7255    0.227
                   0.859     0.3905
                   0.6435    0.265
                   0.5795    0.4295
                   0.8655    0.291
                   0.671     0.347
                   0.596     0.4495
                   0.662     0.3725
                   0.602     0.181
                   0.525     0.2725
                   0.53      0.426
                   0.565     0.1915
                   0.517     0.222
                   0.607     0.395
                   0.3705    0.34
                   0.397     0.307
                   0.4395    0.219
                             0.491
                             0.2265
                             0.2605

MEAN               0.62      0.33
STD DEV            0.14      0.10
THRESHOLD VALUE              0.53

In accordance with the first method of calculation,
and as shown in Figure 29 and in the corresponding
Table 1, 26 of the 29 MS sera give a positive result (net
OD greater than or equal to 0.50), indicating the presence
of IgGs specifically directed against the POL2B peptide,
hence against a portion of the reverse transcriptase
enzyme of the MSRV-1 retrovirus encoded by its pol gene,
and consequently against the MSRV-1 retrovirus. Thus,
approximately 90% of the MS patients tested have reacted
against an epitope carried by the POL2B peptide and
possess circulating IgGs directed against the latter.
Five out of 32 blood donors in apparent good health show a
positive result. Thus, it is apparent that approximately 15%
of the symptomless population may have been in contact with
an epitope carried by the POL2B peptide under conditions
which have led to an active immunization which manifests
itself in the persistence of specific serum IgGs. These
conditions are compatible with an immunization against the
MSRV-1 retrovirus reverse transcriptase during an infection
with (and/or reactivation of) the MSRV-1 retrovirus. The
absence of apparent neurological pathology recalling MS in
these seropositive controls may indicate that they are
healthy carriers and have eliminated an infectious virus
after immunizing themselves, or that they constitute an
at-risk population of chronic carriers. In effect,
epidemiological data showing that a pathogenic agent present
in the environment of regions of high prevalence of MS may
be the cause of this disease imply that a fraction of the
population free from MS has necessarily been in contact with
such a pathogenic agent. It has been shown that the MSRV-1
retrovirus constitutes all or part of this "pathogenic
agent" at the source of MS, and it is hence normal for
controls taken from a healthy population to possess IgG type
antibodies against components of the MSRV-1 retrovirus. The
difference in seroprevalence between the MS and control
populations is nonetheless extremely significant:
"chi-squared" test, p < 0.001. These results hence point to
an aetiopathogenic role of MSRV-1 in MS.
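The seroprevalence comparison above can be checked with a minimal sketch. It assumes that the test was a Pearson chi-squared test on the 2 x 2 table of positive and negative sera (26/29 MS, 5/32 controls) without continuity correction; the text does not state which variant was used, and the function names are illustrative only.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Counts quoted in the text for the IgG test (assumed to be the tested table).
ms_pos, ms_total = 26, 29
ctl_pos, ctl_total = 5, 32

chi2 = chi_squared_2x2(ms_pos, ms_total - ms_pos, ctl_pos, ctl_total - ctl_pos)
p = p_value_df1(chi2)

print(f"MS seroprevalence:      {ms_pos / ms_total:.0%}")
print(f"control seroprevalence: {ctl_pos / ctl_total:.0%}")
print(f"chi-squared = {chi2:.1f}, p = {p:.1e}")
```

Under these assumptions the statistic lies well above 10.83, the critical value for p = 0.001 with one degree of freedom, which agrees with the significance level stated in the text.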

d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

The ELISA technique with the POL2B peptide was used to test
for the presence of anti-MSRV-1 IgM specific antibodies in
the serum of 36 patients for whom a definite or probable
diagnosis of MS was established according to the criteria of
Poser (23), and of 42 healthy controls (blood donors).

Figure 30 shows the results for each serum tested with an
anti-IgM antibody. Each vertical bar represents the net
optical density (OD at 492 nm) of a serum tested. The
ordinate axis gives the net OD at the top of the vertical
bars. The first 36 vertical bars lying to the left of the
vertical line cutting the abscissa axis represent the sera
of 36 cases of MS tested, and the vertical bars lying to the
right of the vertical broken line represent the sera of 42
healthy controls (blood donors). The horizontal line drawn
in the middle of the diagram represents a theoretical
threshold defining the boundary of the positive results (in
which the top of the bar lies above) and the negative
results (in which the top of the bar lies below).

The mean of the net OD values for the MS cases tested is
0.19.

The mean of the net OD values for the controls is 0.09.

The standard deviation of the negative controls is 0.05.

In view of the small difference between the mean and the
standard deviation of the controls, the threshold of
theoretical positivity may be calculated according to the
formula:
threshold value = (mean of the net OD values of the
seronegative controls) + (3 x standard deviation of the net
OD values of the seronegative controls).

The threshold value is hence equal to 0.09 + (3 x 0.05)
= 0.26 (as computed from the unrounded mean and standard
deviation; the rounded figures shown would give 0.24); or,
in practice, 0.25.
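The threshold formula can be illustrated with a short sketch using the 42 control values listed in Table 2 (transcribed here from the table, so any transcription error carries over). The use of the sample standard deviation is an assumption, since the text does not say which estimator was applied; the function name is illustrative.

```python
from statistics import mean, stdev

# Net OD values of the 42 healthy controls, transcribed from Table 2.
controls = [
    0.243, 0.11, 0.098, 0.028, 0.094, 0.038, 0.176, 0.146, 0.049, 0.161,
    0.113, 0.079, 0.093, 0.127, 0.02, 0.052, 0.062, 0.074, 0.043, 0.046,
    0.041, 0.13, 0.153, 0.107, 0.178, 0.114, 0.078, 0.118, 0.177, 0.026,
    0.024, 0.046, 0.116, 0.04, 0.028, 0.073, 0.008, 0.074, 0.141, 0.219,
    0.047, 0.017,
]


def positivity_threshold(values, k=3.0):
    """Mean of the seronegative controls plus k standard deviations."""
    return mean(values) + k * stdev(values)


threshold = positivity_threshold(controls)
positives_among_controls = sum(v > threshold for v in controls)

print(f"mean = {mean(controls):.3f}, SD = {stdev(controls):.3f}")
print(f"threshold = {threshold:.3f}")
print(f"controls above threshold: {positives_among_controls}")
```

With the unrounded mean and standard deviation the threshold comes out at about 0.26, as in Table 2, whereas the rounded figures 0.09 and 0.05 alone would give 0.24.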

The negative results represent a non-specific "background",
as opposed to the presence of antibodies directed
specifically against an epitope of the peptide.

According to this analysis, and as shown in Figure 30 and in
the corresponding Table 2, the IgM test is specific for MS,
since no control has a net OD above the threshold. Seven of
the 36 MS sera produce a positive IgM result; a study of the
clinical data reveals that these positive sera were taken
during a first attack of MS or an acute attack in untreated
patients. It is known that IgMs directed against pathogenic
agents are produced during primary infections or during
reactivations following a latency phase of the said
pathogenic agent.

The difference in seroprevalence between the MS and control
populations is extremely significant: "chi-squared" test,
p < 0.001.

These results point to an aetiopathogenic role of MSRV-1 in
MS.

The detection of IgM and IgG antibodies against the POL2B
peptide enables the course of an MSRV-1 infection and/or of
the viral reactivation of MSRV-1 to be evaluated.

TABLE No. 2

                   MS       CONTROLS

                   0.064    0.243
                   0.087    0.11
                   0.044    0.098
                   0.115    0.028
                   0.089    0.094
                   0.025    0.038
                   0.097    0.176
                   0.108    0.146
                   0.018    0.049
                   0.234    0.161
                   0.274    0.113
                   0.225    0.079
                   0.314    0.093
                   0.522    0.127
                   0.306    0.02
                   0.143    0.052
                   0.375    0.062
                   0.142    0.074
                   0.157    0.043
                   0.168    0.046
                   1.051    0.041
                   0.104    0.13
                   0.187    0.153
                   0.044    0.107
                   0.053    0.178
                   0.153    0.114
                   0.07     0.078
                   0.033    0.118
                   0.104    0.177
                   0.187    0.026
                   0.044    0.024
                   0.053    0.046
                   0.153    0.116
                   0.07     0.04
                   0.033    0.028
                   0.973    0.073
                            0.008
                            0.074
                            0.141
                            0.219
                            0.047
                            0.017

MEAN               0.19     0.09
STD. DEV.          0.23     0.05
THRESHOLD VALUE             0.26

e) Search for immunodominant epitopes in the POL2B peptide:

In order to reduce the non-specific background and to
optimize the detection of the responses of the anti-MSRV-1
antibodies, the synthesis of octapeptides, advancing in
successive one amino acid steps, covering the whole of the
sequence determined by POL2B, was carried out according to
the protocol described below.

The chemical synthesis of overlapping octapeptides covering
the amino acid sequence 61-110 shown in the identifier SEQ
ID NO: 39 was carried out on an activated cellulose membrane
according to the technique of BERG et al. (1989. J. Am.
Chem. Soc., 111, 8024-8026) marketed by Cambridge Research
Biochemicals under the trade name Spotscan. This technique
permits the simultaneous synthesis of a large number of
peptides and their analysis.

The synthesis is carried out with esterified amino acids in
which the α-amino group is protected with an FMOC group
(Nova Biochem) and the side-chain groups with protective
groups such as trityl, t-butyl ester or t-butyl ether. The
esterified amino acids are solubilized in
N-methylpyrrolidone (NMP) at a concentration of 300 nM, and
0.9 ml are applied to spots of deposit of bromophenol blue.
After incubation for 15 minutes, a further application of
amino acids is carried out, followed by another 15-minute
incubation. If the coupling between two amino acids has
taken place correctly, a coloration modification (change
from blue to yellow-green) is observed. After three washes
in DMF, an acetylation step is performed with acetic
anhydride. Next, the terminal amino groups of the peptides
in the process of synthesis are deprotected with 20%
piperidine in DMF. The spots of deposit are restained with a
1% solution of bromophenol blue in DMF, washed three times
with methanol and dried. This set of operations constitutes
one cycle of addition of an amino acid, and this cycle is
repeated until the synthesis is complete. When all the amino
acids have been added, the NH2-terminal group of the last
amino acid is deprotected with 20% piperidine in DMF and
acetylated with acetic anhydride. The groups protecting the
side chain are removed with a
dichloromethane/trifluoroacetic acid/triisobutylsilane
(5 ml/5 ml/250 ml) mixture. The immunoreactivity of the
peptides is then tested by ELISA.

After synthesis of the different octapeptides in duplicate
on two different membranes, the latter are rinsed with
methanol and washed in TBS (0.1M Tris pH 7.2), then
incubated overnight at room temperature in a saturation
buffer. After several washes in TBS-T (0.1M Tris pH 7.2 -
0.05% Tween 20), one membrane is incubated with a 1/50
dilution of a reference serum originating from a patient
suffering from MS, and the other membrane with a 1/50
dilution of a pool of sera of healthy controls. The
membranes are incubated for 4 hours at room temperature.
After washes with TBS-T, a β-galactosidase-labelled
anti-human immunoglobulin conjugate (marketed by Cambridge
Research Biochemicals) is added at a dilution of 1/200, and
the mixture is incubated for two hours at room temperature.
After washes of the membranes with 0.05% TBS-T and PBS, the
immunoreactivity in the different spots is visualized by
adding 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside in
potassium. The intensity of coloration of the spots is
estimated qualitatively with a relative value from 0 to 5 as
shown in the attached Figures 31 to 33.

In this way, it is possible to determine two immunodominant
regions at each end of the POL2B peptide, corresponding,
respectively, to the amino acid sequences 65-75 (SEQ ID
NO:41) and 92-109 (SEQ ID NO:42), according to Figure 34,
and lying, respectively, between the octapeptides
Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp (FCIPVRPD) and
Arg-Pro-Asp-Ser-Gln-Phe-Leu-Phe (RPDSQFLF), and
Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg (TVLPQGFR) and
Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln (LFGQALAQ), and a region
which is less reactive but apparently more specific, since
it does not produce any background with the control serum,
represented by the octapeptides
Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu (LFAFEDPL) (SEQ ID NO: 43)
and Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn (FAFEDPLN) (SEQ ID
NO:44).

These regions make it possible to define new peptides which
are more specific and more immunoreactive according to the
usual techniques.
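The principle of octapeptides advancing in one amino acid steps can be sketched as a sliding window. The example string below is not the POL2B sequence: it is merely the two octapeptides FCIPVRPD and RPDSQFLF quoted above, joined on the assumption that they overlap on the residues RPD; the function name is illustrative.

```python
def overlapping_peptides(sequence, length=8, step=1):
    """Return all peptides of the given length, advancing `step` residues at a time."""
    return [
        sequence[i:i + length]
        for i in range(0, len(sequence) - length + 1, step)
    ]


# Assumed stretch obtained by joining FCIPVRPD and RPDSQFLF on their shared "RPD".
region = "FCIPVRPDSQFLF"
peptides = overlapping_peptides(region)

for offset, peptide in enumerate(peptides):
    print(offset, peptide)

# A 50-residue stretch such as positions 61-110 gives 50 - 8 + 1 spots.
spots_for_50_residues = len(overlapping_peptides("A" * 50))
print(spots_for_50_residues)
```

On this basis, covering the 50 residues from position 61 to 110 requires 43 octapeptide spots per membrane.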

It is thus possible, as a result of the discoveries made and
the methods developed by the inventors, to carry out a
diagnosis of MSRV-1 infection and/or reactivation and to
evaluate a therapy in MS on the basis of its efficacy in
"negativing" the detection of these agents in the patients'
biological fluids. Furthermore, early detection in
individuals not yet displaying neurological signs of MS
could make it possible to institute a treatment which would
be all the more effective with respect to the subsequent
clinical course for the fact that it would precede the
lesion stage which corresponds to the onset of neurological
disorders. Now, at the present time, a diagnosis of MS
cannot be established before a symptomatology of
neurological lesions has set in, and hence no treatment is
instituted before the emergence of a clinical picture
suggestive of lesions of the central nervous system which
are already significant. The diagnosis of an MSRV-1 and/or
MSRV-2 infection and/or reactivation in man is hence of
decisive importance, and the present invention provides the
means of doing this.

It is thus possible, apart from carrying out a diagnosis of
MSRV-1 infection and/or reactivation, to evaluate a therapy
in MS on the basis of its efficacy in "negativing" the
detection of these agents in the patients' biological
fluids.

EXAMPLE 12: OBTAINING A CLONE LB19 CONTAINING A
PORTION OF THE gag GENE OF THE MSRV-1 RETROVIRUS

A PCR technique derived from the technique published by
Gonzalez-Quintial R et al. (19) and PLAZA et al. (25) was
used. From the total RNAs extracted from a fraction of
virion purified as described above, the cDNA was synthesized
using a specific primer (SEQ ID No. 64) at the 3' end of the
genome to be amplified, using EXPAND™ REVERSE TRANSCRIPTASE
(BOEHRINGER MANNHEIM).

cDNA:

AAGGGGCATG GACGAGGTGG TGGCTTATTT (SEQ ID NO: 65)
(antisense)

After purification, a poly(G) tail was added at the 5' end
of the cDNA using the "Terminal transferases kit" marketed
by the company Boehringer Mannheim, according to the
manufacturer's protocol.

An anchoring PCR was carried out using the following 5' and
3' primers:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC (SEQ ID No. 91)
(sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)

Next, a semi-nested anchoring PCR was carried out with the
following 5' and 3' primers:

AGATCTGCAG AATTCGATAT CA (SEQ ID No. 92) (sense), and
AAATGTCTGC GGCACCAATC TCCATGTT (SEQ ID No. 64) (antisense)
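The relationship between the two sense primers can be made explicit with a small sketch: the anchor primer (SEQ ID No. 91) appears to consist of the adapter sequence of SEQ ID No. 92 followed by a run of cytosines able to pair with the poly(G) tail. The sequences are copied from the text above; the variable and function names are illustrative only.

```python
# Sense primers as printed in the text, with the spacing between blocks removed.
anchor_primer = "".join("AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC".split())  # SEQ ID No. 91
adapter_primer = "".join("AGATCTGCAG AATTCGATAT CA".split())                # SEQ ID No. 92


def split_anchor(anchor, adapter):
    """Split an anchor primer into its adapter part and its homopolymer tail."""
    if not anchor.startswith(adapter):
        raise ValueError("adapter is not a 5' prefix of the anchor primer")
    return adapter, anchor[len(adapter):]


adapter, tail = split_anchor(anchor_primer, adapter_primer)

print(f"anchor primer length:  {len(anchor_primer)} nt")
print(f"adapter primer length: {len(adapter)} nt")
print(f"tail: {tail} ({len(tail)} nt)")
```

The semi-nested step therefore reuses the same antisense primer (SEQ ID No. 64) while the sense primer is shortened to the adapter part alone.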

The products originating from the PCR were purified on
agarose gel according to conventional methods (17), and then
resuspended in 10 microlitres of distilled water. Since one
of the properties of Taq polymerase consists in adding an
adenine at the 3' end of each of the two DNA strands, the
DNA obtained was inserted directly into a plasmid using the
TA Cloning™ kit (British Biotechnology). The 2 µl of DNA
solution were mixed with 5 µl of sterile distilled water,
1 µl of 10-fold concentrated ligation buffer "10x LIGATION
BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/ml) and 1 µl of "T4
DNA LIGASE". This mixture was incubated overnight at 12°C.
The following steps were carried out according to the
instructions of the TA Cloning™ kit (British
Biotechnology). At the end of the procedure, the white
colonies of recombinant bacteria were picked out in order to
be cultured and to permit extraction of the plasmids
incorporated according to the so-called "miniprep" procedure
(17). The plasmid preparation from each recombinant colony
was cut with a suitable restriction enzyme and analysed on
agarose gel. Plasmids possessing an insert detected under UV
light after staining the gel with ethidium bromide were
selected for sequencing of the insert, after hybridization
with a primer complementary to the Sp6 promoter present on
the cloning plasmid of the TA Cloning Kit™. The reaction
prior to sequencing was then performed according to the
method recommended for the use of the sequencing kit "Prism
ready reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic sequencing
was carried out with an Applied Biosystems "Automatic
Sequencer, model 373 A" apparatus according to the
manufacturer's instructions.
PCR amplification according to the technique mentioned above
was used on a cDNA synthesized from the nucleic acids of
fractions of infective particles purified on a sucrose
gradient, according to the technique described by H. Perron
(13), from culture supernatants of B lymphocytes of a
patient suffering from MS, immortalized with Epstein-Barr
virus (EBV) strain B95 and expressing retroviral particles
associated with reverse transcriptase activity as described
by Perron et al. (3) and in French Patent Applications
MS 10, 11 and 12. This yielded the clone LB19, whose
sequence, identified by SEQ ID NO: 59, is presented in
Figure 35.

The clone makes it possible to define, with the clone GM3
previously sequenced and the clone G+E+A (see Example 15), a
region of 690 base pairs representative of a significant
portion of the gag gene of the MSRV-1 retrovirus, as
presented in Figure 36. This sequence, designated SEQ ID NO:
88, is reconstituted from different clones overlapping at
their ends. This sequence is identified under the name
MSRV-1 "gag*" region. In Figure 36, a potential reading
frame with the translation into amino acids is presented
below the nucleic acid sequence.

EXAMPLE 13: OBTAINING A CLONE FBd13 CONTAINING A
pol GENE REGION RELATED TO THE MSRV-1 RETROVIRUS AND AN
APPARENTLY INCOMPLETE ENV REGION CONTAINING A POTENTIAL
READING FRAME (ORF) FOR A GLYCOPROTEIN

Extraction of viral RNAs: The RNAs were extracted according
to the method briefly described below.

A pool of culture supernatant of B lymphocytes of patients
suffering from MS (650 ml) is centrifuged for 30 minutes at
10,000 g. The viral pellet obtained is resuspended in
300 microlitres of PBS/10 mM MgCl2. The material is treated
with a DNAse (100 mg/ml)/RNAse (50 mg/ml) mixture for
30 minutes at 37°C and then with proteinase K (50 mg/ml) for
30 minutes at 46°C.

The nucleic acids are extracted with one volume of a
phenol/0.1% SDS (V/V) mixture heated to 60°C, and then
re-extracted with one volume of phenol/chloroform (1:1;
V/V).

Precipitation of the material is performed with 2.5 V of
ethanol in the presence of 0.1 V of sodium acetate pH 5.2.
The pellet obtained after centrifugation is resuspended in
50 microlitres of sterile DEPC water.

The sample is treated again with 50 mg/ml of "RNAse free"
DNAse for 30 minutes at room temperature, extracted with one
volume of phenol/chloroform and precipitated in the presence
of sodium acetate and ethanol.

The RNA obtained is quantified by an OD reading at 260 nm.
The presence of MSRV-1 and the absence of DNA contaminant
are monitored by a PCR and an MSRV-1-specific RT-PCR
associated with a specific ELOSA for the MSRV-1 genome.

Synthesis of cDNA:

5 mg of RNA are used to synthesize a cDNA primed with a
poly(dT) oligonucleotide according to the instructions of
the "cDNA Synthesis Module" kit (ref RPN 1256, Amersham)
with a few modifications: the reverse transcription is
performed at 45°C instead of the recommended 42°C.

The synthesis product is purified by a double extraction and
a double purification according to the manufacturer's
instructions.

The presence of MSRV-1 is verified by an MSRV-1 PCR
associated with a specific ELOSA for the MSRV-1 genome.

"Long Distance PCR" (LD-PCR):

500 ng of cDNA are used for the LD-PCR step (Expand Long
Template System; Boehringer (ref. 1681 842)).

Several pairs of oligonucleotides were used. Among these,
the pair defined by the following primers:
5' primer: GGAGAAGAGC AGCATAAGTG G (SEQ ID NO: 66)
3' primer: GTGCTGATTG GTGTATTTAC AATCC (SEQ ID NO: 67).
The amplification conditions are as follows:
94°C 10 seconds
56°C 30 seconds
68°C 5 minutes;
10 cycles, then 20 cycles with an increment of 20 seconds in
each cycle on the elongation time. At the end of this first
amplification, 2 microlitres of the amplification product
are subjected to a second amplification under the same
conditions as before.

The LD-PCR reactions are conducted in a Perkin model 9600
PCR apparatus in thin-walled microtubes (Boehringer).
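The cycling programme described above can be written out as a schedule of elongation times. The sketch assumes that the 20-second increment is applied from the first of the 20 later cycles onwards (so that cycle 11 elongates for 5 min 20 s); the text does not specify this, and the function and parameter names are illustrative.

```python
def elongation_times(base_s=300, fixed_cycles=10, extended_cycles=20, increment_s=20):
    """Elongation time in seconds for each cycle of the LD-PCR programme."""
    times = [base_s] * fixed_cycles
    # Assumption: the increment is cumulative and starts with the first extended cycle.
    times += [base_s + increment_s * (i + 1) for i in range(extended_cycles)]
    return times


schedule = elongation_times()
total_elongation_s = sum(schedule)

print(f"cycles: {len(schedule)}")
print(f"elongation in cycle 10: {schedule[9]} s")
print(f"elongation in cycle 11: {schedule[10]} s")
print(f"elongation in cycle 30: {schedule[-1]} s")
print(f"total elongation time: {total_elongation_s / 60:.0f} min")
```

Under this reading the final cycle elongates for 700 seconds, and the elongation steps alone account for 220 minutes per amplification.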

The amplification products are monitored by electrophoresis
of 1/5th of the amplification volume (10 microlitres) in 1%
agarose gel. For the pair of primers described above, a band
of approximately 1.7 Kb is obtained.

Cloning of the amplified fragment:

The PCR product was purified by passage through a
preparative agarose gel and then through a Costar column
(Spin; D. Dutcher) according to the supplier's instructions.

2 microlitres of the purified solution are ligated with
50 ng of vector PCRII according to the supplier's
instructions (TA Cloning Kit; British Biotechnology).

The recombinant vector obtained is isolated by
transformation of competent DH5αF' bacteria. The bacteria
are selected using their resistance to ampicillin and the
loss of metabolism for Xgal (= white colonies). The
molecular structure of the recombinant vector is confirmed
by plasmid minipreparation and hydrolysis with the enzyme
EcoRI.

FBd13, a positive clone for all these criteria, was
selected. A large-scale preparation of the recombinant
plasmid was performed using the Midiprep Qiagen kit (ref
12243) according to the supplier's instructions.

Sequencing of the clone FBd13 is performed by means of the
Perkin Prism Ready Amplitaq FS dye terminator kit (ref.
402119) according to the manufacturer's instructions. The
sequence reactions are introduced into a Perkin type 377 or
373A automatic sequencer. The sequencing strategy consists
in gene walking carried out on both strands of the clone
FBd13.

The sequence of the clone FBd13 is identified by SEQ ID NO:
58.

In Figure 37, the sequence homology between the clone FBd13
and the HSERV-9 retrovirus is shown on the matrix chart by a
continuous line for any partial homology greater than or
equal to 70%. It can be seen that there are homologies in
the flanking regions of the clone (with the pol gene at the
5' end and with the env gene and then the LTR at the 3'
end), but that the internal region is totally divergent and
does not display any homology, even weak, with the env gene
of HSERV-9. Furthermore, it is apparent that the clone FBd13
contains a longer "env" region than the one which is
described for the defective endogenous HSERV-9; it may thus
be seen that the internal divergent region constitutes an
"insert" between the regions of partial homology with the
HSERV-9 defective genes.

This additional sequence determines a potential orf,
designated ORF B13, which is represented by its amino acid
sequence SEQ ID NO: 87.

The molecular structure of the clone FBd13 was analyzed
using the GeneWork software and Genebank and SwissProt data
banks.

5 glycosylation sites were found.

The protein does not have significant homology with already
known sequences.

It is probable that this clone originates from a
recombination of an endogenous retroviral element (ERV),
linked to the replication of MSRV-1.
Such a phenomenon is bound to give rise to the expression of
polypeptides, or even of endogenous retroviral proteins,
which are not necessarily tolerated by the immune system.
Such a scheme of aberrant expression of endogenous elements
related to MSRV-1 and/or induced by the latter is liable to
multiply the aberrant antigens, and hence tends to
contribute to the induction of autoimmune processes such as
are observed in MS. It clearly constitutes a novel element
never hitherto described. In effect, interrogation of the
data banks of nucleic acid sequences available in version
No. 19 (1996) of the "Entrez" software (NCBI, NIH, Bethesda,
USA) did not enable a known homologous sequence comprising
the whole of the env region of this clone to be identified.



EXAMPLE 14: OBTAINING A CLONE FP6 CONTAINING A
PORTION OF THE pol GENE, WITH A REGION CODING FOR THE
REVERSE TRANSCRIPTASE ENZYME HOMOLOGOUS TO THE CLONE POL*
MSRV-1, AND A 3'pol REGION DIVERGENT FROM THE EQUIVALENT
SEQUENCES DESCRIBED IN THE CLONES POL*, tpol, FBd3, JLBCl
and JL6C2

A 3'RACE was performed on total RNA extracted from plasma of
a patient suffering from MS. A healthy control plasma
treated under the same conditions was used as negative
control. The synthesis of cDNA was carried out with the
following modified oligo(dT) primer:
5' GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 3' (SEQ ID NO: 68)
and Boehringer "Expand RT" reverse transcriptase according
to the conditions recommended by the company. A PCR was
performed with the enzyme Klentaq (Clontech) under the
following conditions: 94°C 5 min then 93°C 1 min, 58°C
1 min, 68°C 3 min for 40 cycles and 68°C for 8 min, and with
a final reaction volume of 50 µl.
Primers used for the PCR:
- 5' primer, identified by SEQ ID NO: 69
5' GCCATCAAGC CACCCAAGAA CTCTTAACTT 3';
- 3' primer, identified by SEQ ID NO: 68 (= the same as for
the cDNA)

A second, so-called "semi-nested" PCR was carried out with a
5' primer located within the region already amplified. This
second PCR was performed under the same experimental
conditions as those used in the first PCR, using 10 µl of
the amplification product originating from the first PCR.

Primers used for the semi-nested PCR:

- 5' primer, identified by SEQ ID NO: 70
5' CGAATAGGCA GACGATTATA TACAGTAATT 3';

- 3' primer, identified by SEQ ID NO: 68 (= the same as for
the cDNA)

Primers SEQ ID NO: 69 and SEQ ID NO: 70 are specific for the
pol* region: position No. 403 to No. 422 and No. 641 to
No. 670, respectively.

An amplification product was thus obtained from the
extracellular RNA extracted from the plasma of a patient
suffering from MS. The corresponding fragment was not
observed for the plasma of the healthy control. This
amplification product was cloned in the following manner.

The amplified DNA was inserted into a plasmid using the TA
Cloning™ kit. The 2 µl of DNA solution were mixed with 5 µl
of sterile distilled water, 1 µl of a 10-fold concentrated
ligation buffer "10x LIGATION BUFFER", 2 µl of "pCR™ VECTOR"
(25 ng/ml) and 1 µl of "T4 DNA LIGASE". This mixture was
incubated overnight at 12°C. The following steps were
carried out according to the instructions of the TA Cloning™
kit (British Biotechnology). At the end of the procedure,
the white colonies of recombinant bacteria were picked out
in order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called "miniprep"
procedure (17). The plasmid preparation from each
recombinant colony was cut with a suitable restriction
enzyme and analyzed on agarose gel. Plasmids possessing an
insert detected under UV light after staining the gel with
ethidium bromide were selected for sequencing of the insert,
after hybridization with a primer complementary to the Sp6
promoter present on the cloning plasmid of the TA Cloning
Kit™. The reaction prior to sequencing was then performed
according to the method recommended for the use of the
sequencing kit "Prism ready reaction kit dye deoxyterminator
cycle sequencing kit" (Applied Biosystems, ref. 401384), and
automatic sequencing was carried out with an Applied
Biosystems "Automatic Sequencer, model 373 A" apparatus
according to the manufacturer's instructions.

The clone obtained, designated FP6, enables a region of
467 bp which is 89% homologous to the pol* region of the
MSRV-1 retrovirus and a region of 1167 bp which is 64%
homologous to the pol region of ERV-9 (No. 1634 to 2856) to
be defined.

The clone FP6 is represented in Figure 38 by its nucleotide
sequence identified by SEQ ID NO: 61. The three potential
reading frames of this clone are indicated by their amino
acid sequence under the nucleotide sequence.

EXAMPLE 15: OBTAINING A REGION DESIGNATED G+E+A
CONTAINING AN ORF FOR A RETROVIRAL PROTEASE, BY PCR
AMPLIFICATION OF THE NUCLEIC ACID SEQUENCE CONTAINED
BETWEEN THE 5' REGION DEFINED BY THE CLONE "GM3" AND THE
3' REGION DEFINED BY THE CLONE POL*, FROM THE RNA
EXTRACTED FROM A POOL OF PLASMAS OF PATIENTS SUFFERING
FROM MS

Oligonucleotides specific for the MSRV-1 sequences already
identified by the Applicant were defined in order to amplify
the retroviral RNA originating from virions present in the
plasma of patients suffering from MS. Control reactions were
performed so as to monitor the presence of contaminants
(reaction with water). The amplification consists of a step
of RT-PCR followed by a "nested" PCR. Pairs of primers were
defined for amplifying three overlapping regions (designated
G, E and A) on the regions defined by the sequences of the
clones GM3 and pol* described above.

Semi-nested RT-PCR for amplification of the region G:
- in the first RT-PCR cycle, the following primers are used:
primer 1: SEQ ID NO: 71 (sense)
primer 2: SEQ ID NO: 72 (antisense)
- in the second PCR cycle, the following primers are used:
primer 1: SEQ ID NO: 73 (sense)
primer 4: SEQ ID NO: 74 (antisense)

Nested RT-PCR for amplification of the region E:
- in the first RT-PCR cycle, the following primers are used:
primer 5: SEQ ID NO: 75 (sense)
primer 6: SEQ ID NO: 76 (antisense)
- in the second PCR cycle, the following primers are used:
primer 7: SEQ ID NO: 77 (sense)
primer 8: SEQ ID NO: 78 (antisense)

Semi-nested RT-PCR for amplification of the region A:
- in the first RT-PCR cycle, the following primers are used:
primer 9: SEQ ID NO: 79 (sense)
primer 10: SEQ ID NO: 80 (antisense)
- in the second PCR cycle, the following primers are used:
primer 9: SEQ ID NO: 81 (sense)
primer 11: SEQ ID NO: 82 (antisense)

The primers and the regions G, E and A which they define are
positioned as follows:

cDNA

1      G      4  2
       5  7      E      8  6
                 9      A      11  10
<                X                  >
GM3                              POL*

The sequence of the region defined by the different clones
G, E and A was determined after cloning and sequencing of
the "nested" amplification products.
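The distinction between the "nested" and "semi-nested" schemes listed above can be sketched from the primer numbers alone: a second round that keeps one primer of the first round is semi-nested, and one that replaces both is nested. The primer numbering is taken from the lists above; the function name and the dictionary layout are illustrative assumptions.

```python
def classify_second_round(first_round, second_round):
    """Classify a two-round PCR scheme from the primer numbers used in each round."""
    shared = set(first_round) & set(second_round)
    if len(shared) == 1:
        return "semi-nested"
    if not shared:
        return "nested"
    return "same primers"


# (first-round primers, second-round primers) for each region, as listed in the text.
regions = {
    "G": ((1, 2), (1, 4)),
    "E": ((5, 6), (7, 8)),
    "A": ((9, 10), (9, 11)),
}

kinds = {name: classify_second_round(*rounds) for name, rounds in regions.items()}

for name, kind in kinds.items():
    print(f"region {name}: {kind}")
```

The outcome matches the headings given for the three regions: G and A are amplified by semi-nested RT-PCR, and E by fully nested RT-PCR.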

The clones G, E and A were assembled together by PCR with
the primers 1 at the 5' end of the fragment G and 11 at the
3' end of the fragment A, the primers being described above.
An approximately 1580-bp fragment G+E+A was amplified and
inserted into a plasmid using the TA Cloning (trademark)
kit. The sequence of the amplification product corresponding
to G+E+A was determined and analysis of the G+E and E+A
overlaps was carried out. The sequence is shown in Figure
39, and corresponds to the sequence SEQ ID NO: 89.

A reading frame coding for an MSRV-1 retroviral protease was
found in the region E. The amino acid sequence of the
protease, identified by SEQ ID NO: 90, is presented in
Figure 40.

EXAMPLE 16: OBTAINING A CLONE LTRGAG12, RELATED
TO AN ENDOGENOUS RETROVIRAL ELEMENT (ERV) CLOSE TO MSRV-1,
IN THE DNA OF AN MS LYMPHOBLASTOID LINE PRODUCING VIRIONS
AND EXPRESSING THE MSRV-1 RETROVIRUS

A nested PCR was performed on the DNA extracted from a
lymphoblastoid line (B lymphocytes immortalized with the EBV
virus strain B95, as described above and as is well known to
a person skilled in the art) expressing the MSRV-1
retrovirus and originating from peripheral blood lymphocytes
of a patient suffering from MS.

In the first PCR step, the following primers are used:

primer 4327: CTCGATTTCT TGCTGGGCCT TA (SEQ ID NO: 83)
primer 3512: GTTGATTCCC TCCTCAAGCA (SEQ ID NO: 84)

This step comprises 35 amplification cycles with the
following conditions: 1 min at 94°C, 1 min at 54°C and 4 min
at 72°C.

In the second PCR step, the following primers are used:

primer 4294: CTCTACCAAT CAGCATGTGG (SEQ ID NO: 85)
primer 3591: TGTTCCTCTT GGTCCCTAT (SEQ ID NO: 86)

This step comprises 35 amplification cycles with the
following conditions: 1 min at 94°C, 1 min at 54°C and 4 min
at 72°C.

The products originating from the PCR were purified on
agarose gel according to conventional methods (17), and
then resuspended in 10 µl of distilled water. Since one of
the properties of Taq polymerase consists in adding an
adenine at the 3' end of each of the two DNA strands, the
DNA obtained was inserted directly into a plasmid using the
TA Cloning™ kit (British Biotechnology). The 2 µl of DNA
solution were mixed with 5 µl of sterile distilled water,
1 µl of a 10-fold concentrated ligation buffer "10X
LIGATION BUFFER", 2 µl of "pCR™ VECTOR" (25 ng/µl) and
1 µl of "TA DNA LIGASE". This mixture was incubated
overnight at 12°C. The following steps were carried out
according to the instructions of the TA Cloning™ kit
(British Biotechnology). At the end of the procedure, the
white colonies of recombinant bacteria were picked out in
order to be cultured and to permit extraction of the
plasmids incorporated according to the so-called
"miniprep" procedure (17). The plasmid preparation from
each recombinant colony was cut with a suitable
restriction enzyme and analyzed on agarose gel. The
plasmids possessing an insert detected under UV light
after staining the gel with ethidium bromide were selected
for sequencing of the insert, after hybridization with a
primer complementary to the Sp6 promoter present on the
cloning plasmid of the TA Cloning Kit™. The reaction prior
to sequencing was then performed according to the method
recommended for the use of the sequencing kit "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems, ref. 401384), and automatic
sequencing was carried out with an Applied Biosystems
"Automatic Sequencer, model 373 A" apparatus according to
the manufacturer's instructions.
Thus, a clone designated LTRGAG12 could be obtained, and is
represented by its internal sequence identified by
SEQ ID NO: 60.

This clone is probably representative of endogenous
elements close to ERV-9, present in human DNA, in
particular in the DNA of patients suffering from MS, and
capable of interfering with the expression of the MSRV-1
retrovirus, hence capable of having a role in the
pathogenesis associated with the MSRV-1 retrovirus and
capable of serving as marker for a specific expression in
the pathology in question.

EXAMPLE 17: DETECTION OF ANTI-MSRV-1 SPECIFIC
ANTIBODIES IN HUMAN SERUM

Identification of the sequence of the pol gene of the
MSRV-1 retrovirus and of an open reading frame of this gene
enabled the amino acid sequence SEQ ID NO: 63 of a region
of the said gene, referenced SEQ ID NO: 62, to be
determined.

Different synthetic peptides corresponding to fragments of
the protein sequence of MSRV-1 reverse transcriptase
encoded by the pol gene were tested for their antigenic
specificity with respect to sera of patients suffering from
MS and of healthy controls.

The peptides were synthesized chemically by solid-phase
synthesis according to the Merrifield technique (22). The
practical details are those described below.

a) Peptide synthesis:

The peptides were synthesized on a
phenylacetamidomethyl (PAM)/polystyrene/divinylbenzene
resin (Applied Biosystems, Inc., Foster City, CA), using an
"Applied Biosystems 430A" automatic synthesizer. The amino
acids are coupled in the form of hydroxybenzotriazole
(HOBT) esters. The amino acids used are obtained from
Novabiochem (Läufelfingen, Switzerland) or Bachem
(Bubendorf, Switzerland).

The chemical synthesis was performed using a double
coupling protocol with N-methylpyrrolidone (NMP) as
solvent. The peptides were cut from the resin, as well as
the side-chain protective groups, simultaneously, using
hydrofluoric acid (HF) in a suitable apparatus (type I
cleavage apparatus, Peptide Institute, Osaka, Japan).

For 1 g of peptidyl resin, 10 ml of HF, 1 ml of anisole
and 1 ml of dimethyl sulphide (DMS) are used. The mixture
is stirred for 45 minutes at -2°C. The HF is then
evaporated off under vacuum. After intensive washes with
ether, the peptide is eluted from the resin with 10%
acetic acid and then lyophilized.

The peptides are purified by preparative high performance
liquid chromatography on a VYDAC C18 type column
(250 x 21 mm) (The Separation Group, Hesperia, CA, USA).
Elution is carried out with an acetonitrile gradient at a
flow rate of 22 ml/min. The fractions collected are
monitored by an elution under isocratic conditions on a
VYDAC™ C18 analytical column (250 x 4.6 mm) at a flow rate
of 1 ml/min. Fractions having the same retention time are
pooled and lyophilized. The preponderant fraction is then
analysed by analytical high performance liquid
chromatography with the system described above. The
peptide which is considered to be of acceptable purity
manifests itself in a single peak representing not less
than 95% of the chromatogram.

The purified peptides are then analysed with the object of
monitoring their amino acid composition, using an Applied
Biosystems 420H automatic amino acid analyser. Measurement
of the (average) chemical molecular mass of the peptides is
obtained using LSIMS mass spectrometry in
the positive ion mode on a VG ZAB.ZSEQ double focusing
instrument connected to a DEC-VAX 2000 acquisition system
(VG Analytical Ltd, Manchester, England).

The reactivity of the different peptides was tested against
sera of patients suffering from MS and against sera of
healthy controls. This enabled a peptide designated S24Q to
be selected, whose sequence is identified by SEQ ID NO: 63,
encoded by a nucleotide sequence of the pol gene of MSRV-1
(SEQ ID NO: 62).

b) Antigenic properties:

The antigenic properties of the S24Q peptide were
demonstrated according to the ELISA protocol described
below.

The lyophilized S24Q peptide was dissolved in 10% acetic
acid at a concentration of 1 mg/ml. This stock solution was
aliquoted and kept at +4°C for use over a fortnight, or
frozen at -20°C for use within 2 months. An aliquot is
diluted in PBS (phosphate buffered saline) solution so as
to obtain a final peptide concentration of
5 micrograms/ml. 100 microlitres of this dilution are
placed in each well of Nunc Maxisorb (trade name)
microtitration plates. The plates are covered with a
"plate-sealer" type adhesive and kept for 2 hours at +37°C
for the phase of adsorption of the peptide to the plastic.
The adhesive is removed and the plates are washed three
times with a volume of 300 microlitres of a solution A
(1X PBS, 0.05% Tween 20®), then inverted over an
absorbent tissue. The plates thus drained are filled with
250 microlitres per well of a solution B (solution A + 10%
of goat serum), then covered with an adhesive and
incubated for 1 hour at 37°C. The plates are then washed
three times with the solution A as described above.

The test serum samples are diluted beforehand to 1/100 in
the solution B, and 100 microlitres of each dilute test
serum are placed in the wells of each microtitration
plate. A negative control is placed in one well of each
plate, in the form of 100 microlitres of buffer B. The
plates covered with an adhesive are then incubated for
1 hour 30 min at 37°C. The plates are then washed three
times with the solution A as described above. For the IgG
response, a peroxidase-labelled goat antibody directed
against human IgG (marketed by Jackson Immune Research
Inc.) is diluted in the solution B (dilution 1/10,000).
100 microlitres of the appropriate dilution of the
labelled antibody are then placed in each well of the
microtitration plates, and the plates covered with an
adhesive are incubated for 1 hour at 37°C. A further
washing of the plates is then performed as described
above. In parallel, the peroxidase substrate is prepared
according to the directions of the bioMerieux kits. 100
microlitres of substrate solution are placed in each well,
and the plates are placed protected from light for 20 to
30 minutes at room temperature.

When the colour reaction has stabilized, 50 microlitres of
Color 2 (bioMerieux trade name) are placed in each well in
order to stop the reaction. The plates are placed
immediately in an ELISA plate spectrophotometric reader,
and the optical density (OD) of each well is read at a
wavelength of 492 nm.

The serological samples are introduced in duplicate or in
triplicate, and the optical density (OD) corresponding to
the serum tested is calculated by taking the mean of the OD
values obtained for the same sample at the same dilution.

The net OD of each serum corresponds to the mean OD of the
serum minus the mean OD of the negative control
(solution B: PBS, 0.05% Tween 20®, 10% goat serum).
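
The net OD calculation just described can be illustrated with a short Python sketch. It is only an illustration of the arithmetic, not software used in the study; the function name `net_od` and the OD readings are hypothetical.

```python
from statistics import mean


def net_od(sample_replicates, negative_control_replicates):
    """Mean OD of the serum replicates minus mean OD of the negative control."""
    return mean(sample_replicates) - mean(negative_control_replicates)


# Hypothetical OD readings at 492 nm (not taken from the study).
serum_duplicate = [0.32, 0.30]   # one serum, tested in duplicate
buffer_b_wells = [0.05]          # negative control well (solution B only)

print(round(net_od(serum_duplicate, buffer_b_wells), 3))
```

The same function accepts triplicates, since it only averages whatever replicate readings it is given.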

c) Detection of anti-MSRV-1 IgG antibodies (S24Q) by
ELISA:

The technique described above was used with the S24Q
peptide to test for the presence of anti-MSRV-1 specific
IgG antibodies in the serum of 15 patients for whom a
definite diagnosis of MS was established according to the
criteria of Poser (23), and of 15 healthy controls (blood
donors).

Figure 41 shows the results for each serum tested with an
anti-IgG antibody. Each vertical bar represents the net
optical density (OD at 492 nm) of a serum tested. The
ordinate axis gives the net OD at the top of the vertical
bars. The first 15 vertical bars lying to the left of the
vertical broken line represent the sera of 15 healthy
controls (blood donors), and the 15 vertical bars lying to
the right of the vertical broken line represent the sera of
15 cases of MS tested. The diagram enables 2 controls to be
revealed whose OD rises above the grouped values of the
control population. These values may represent the presence
of specific IgGs in symptomless seropositive patients. Two
methods were hence evaluated in order to determine the
statistical threshold of positivity of the test.

The mean of the net OD values for the controls, including
the controls with high net OD values, is 0.129 and the
standard deviation is 0.06. Without the 2 controls whose OD
values are greater than 0.2, the mean of the "negative"
controls is 0.107 and the standard deviation is 0.03. A
theoretical threshold of positivity may be calculated
according to the formula:

threshold value = (mean of the net OD values of the
negative controls) + (2 or 3 x standard deviation
of the net OD values of the negative controls).

In the first case, there are considered to be symptomless
seropositives, and the threshold value is equal to
0.11 + (3 x 0.03) = 0.20. The negative results represent a
non-specific "background" of the presence of antibodies
directed specifically against an epitope of the peptide.

In the second case, if the set of controls consisting of
blood donors in apparent good health is taken as a
reference basis, without excluding the sera which are, on
the face of it, seropositive, the standard deviation of the
"non-MS controls" is 0.116. The threshold value then
becomes 0.13 + (3 x 0.06) = 0.31.
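
The two threshold values quoted above follow directly from the formula (mean + 3 x standard deviation). The Python sketch below reproduces that arithmetic using the rounded means and standard deviations given in the text; the function name `positivity_threshold` is an illustrative choice, not a name from the source.

```python
def positivity_threshold(mean_od: float, sd_od: float, k: float = 3.0) -> float:
    """Threshold = mean of negative-control net ODs + k standard deviations."""
    return mean_od + k * sd_od


# First case: the 2 high controls are excluded (mean 0.11, SD 0.03).
first_case = positivity_threshold(0.11, 0.03)

# Second case: all 15 blood donors are kept (mean 0.13, SD 0.06).
second_case = positivity_threshold(0.13, 0.06)

print(f"{first_case:.2f}")   # 0.20
print(f"{second_case:.2f}")  # 0.31
```

Passing `k=2.0` gives the less stringent "2 x standard deviation" variant that the formula also allows.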

According to this latter analysis, the test is specific for
MS since, as shown in Table 1, no control has a net OD
above this threshold. In fact, this result reflects the
fact that the antibody titres in patients suffering from MS
are, for the most part, higher than in healthy controls who
have been in contact with MSRV-1.

In accordance with the first method of calculation, and as
shown in Figure 41 and in Table 3, 6 of the 15 MS sera give
a positive result (OD greater than or equal to 0.2),
indicating the presence of IgGs specifically directed
against the S24Q peptide, hence against a portion of the
reverse transcriptase enzyme of the MSRV-1 retrovirus
encoded by its pol gene, and consequently against the
MSRV-1 retrovirus.

Thus, approximately 40% of the MS patients tested have
reacted against an epitope carried by the S24Q peptide and
possess circulating IgGs directed against the latter.

Two out of 15 blood donors in apparent good health show a
positive result. Thus, it is apparent that approximately
13% of the symptomless population may have been in contact
with an epitope carried by the S24Q peptide under
conditions which have led to an active immunization which
manifests itself in the persistence of specific serum IgGs.
These conditions are compatible with an immunization
against the MSRV-1 retrovirus reverse transcriptase during
an infection with (and/or reactivation of) the MSRV-1
retrovirus. The absence of apparent neurological pathology
recalling MS in these seropositive controls may indicate
that they are healthy carriers and have eliminated an
infectious virus after immunizing themselves, or that they
constitute an at-risk population of chronic carriers. In
effect, epidemiological data showing that a pathogenic
agent present in the environment of regions of high
prevalence of MS may be the cause of this disease imply
that a fraction of the population free from MS has
necessarily been in contact with such a pathogenic agent.
It has been shown that the MSRV-1 retrovirus constitutes
all or part of this "pathogenic agent" at the source of MS,
and it is hence normal for controls taken from a healthy
population to possess IgG type antibodies against
components of the MSRV-1 retrovirus.
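
The approximate percentages quoted for the IgG results (6 of 15 MS sera, 2 of 15 blood donors) can be checked with a few lines of Python. This is a simple arithmetic illustration; the function name `percent_positive` is hypothetical.

```python
def percent_positive(n_positive: int, n_tested: int) -> float:
    """Percentage of tested sera that score positive."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return 100.0 * n_positive / n_tested


ms_rate = percent_positive(6, 15)        # MS sera positive for anti-S24Q IgG
control_rate = percent_positive(2, 15)   # blood donors positive

print(f"MS: {ms_rate:.0f}%  controls: {control_rate:.0f}%")
```

With only 15 sera per group, each additional positive serum shifts the percentage by almost 7 points, so these figures are best read as rough estimates.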

Lastly, the detection of anti-S24Q antibodies in only one
out of two MS cases tested here may reflect the fact that
this peptide does not represent an immunodominant MSRV-1
epitope, that inter-individual strain variations may induce
an immunization against a divergent peptide motif in the
same region, or that the course of the disease and the
treatments followed may modulate over time the antibody
response against the S24Q peptide.



d) Detection of anti-MSRV-1 IgM antibodies by ELISA:

The ELISA technique with the S24Q peptide was used to test
for the presence of anti-MSRV-1 IgM specific antibodies in
the same sera as above.

Figure 42 shows the results for each serum tested with an
anti-IgM antibody. Each vertical bar represents the net
optical density (OD at 492 nm) of a serum tested. The
ordinate axis gives the net OD at the top of the vertical
bars. The first 15 vertical bars lying to the left of the
vertical line cutting the abscissa axis represent the sera
of 15 healthy controls (blood donors), and the vertical
bars lying to the right of the vertical broken line
represent the sera of 15 cases of MS tested.

The mean of the OD values for the MS cases tested is 1.6.

The mean of the net OD values for the controls is 0.7.

The standard deviation of the negative controls is 0.6.

The threshold of theoretical positivity may be calculated
according to the formula:

threshold value = (mean of the OD values of the negative
controls) + (3 x standard deviation of the OD values of the
negative controls)

The threshold value is hence equal to
0.7 + (3 x 0.6) = 2.5.

The negative results represent a non-specific "background"
of the presence of antibodies directed specifically against
an epitope of the peptide.

According to this analysis, and as shown in Figure 42 and
in the corresponding Table 4, the IgM test is specific for
MS, since no control has a net OD above the threshold. 6 of
the 15 MS sera produce a positive IgM result.

The difference in seroprevalence between the MS and control
populations is extremely significant: "chi-squared" test,
p < 0.002.

These results point to an aetiopathogenic role of MSRV-1 in
MS.

Thus, the detection of IgM and IgG antibodies against the
S24Q peptide makes it possible to evaluate, alone or in
combination with other MSRV-1 peptides, the course of an
MSRV-1 infection and/or of the viral reactivation of
MSRV-1.

It is possible, as a result of the new discoveries made and
the new methods developed by the inventors, to permit the
improved implementation of diagnostic tests for MSRV-1
infection and/or reactivation and to evaluate a therapy in
MS and/or RA on the basis of its efficacy in "negativing"
the detection of these agents in the patient's biological
fluids. Furthermore, early detection in individuals not yet
displaying neurological signs of MS or rheumatological
signs of RA could make it possible to institute a treatment
which would be all the more effective with respect to the
subsequent clinical course for the fact that it would
precede the lesion stage which corresponds to the onset of
the clinical disorders. Now, at the present time, a
diagnosis of MS or RA cannot be established before a
symptomatology of lesions has set in, and hence no
treatment is instituted before the emergence of a clinical
picture suggestive of lesions which are already
significant. The diagnosis of an MSRV-1 and/or MSRV-2
infection and/or reactivation in man is hence of decisive
importance, and the present invention provides the means of
doing this.

It is thus possible, apart from carrying out a diagnosis of
MSRV-1 infection and/or reactivation, to evaluate a therapy
in MS on the basis of its efficacy in "negativing" the
detection of these agents in the patients' biological
fluids.

EXAMPLE 18:

1) MATERIALS AND METHODS

- Patients and clinical samples

Choroid plexus cells from MS patients and controls were
obtained from the brain-cell library, Laboratoire
R. Escourolles, Hôpital de la Salpêtrière, Paris, France.
Non-tumoral leptomeningeal cells from controls were
obtained as previously described (26). Peripheral blood
from MS and control patients, used for obtaining B-cell
lines and plasma, was obtained from the Neurological
Departments, CHU de Grenoble, and from INSERM U 134,
Hôpital de la Salpêtrière, France. Clinical details and
origin of the 10 MS patients and of the 10 patients with
other neurological diseases who provided CSF samples are
given in Table 6.

- Cell cultures, virus isolation and purification

All cell-types were cultured as previously described (3, 5,
26).

All cultures were regularly screened for mycoplasma
contamination with an ELISA mycoplasma-detection kit
(Boehringer). None of the cell extracts or supernatants
used contained detectable mycoplasma.

Extracellular virion purification and sucrose density
gradients were performed as previously described (3, 5,
26). From each sucrose gradient, 0.5-1 ml fractions were
collected from the top of the tubes, with a 1000 µl
Pipetman and a different sterile tip for each fraction.
60 µl were used for RT activity assay and the rest was
mixed with 1 volume of buffer containing 4 M guanidinium
thiocyanate, 0.5% N-lauroyl sarcosine, 25 mM EDTA, 0.2%
β-mercaptoethanol adjusted at pH 5.5 with acetic acid.
These mixtures were frozen at -80°C for further RNA
extraction or directly processed according to Chomczynski
(20), with an overnight precipitation step at -20°C, in
presence of RNase-free glycogen (Boehringer). RNA was
dissolved in 20 to 50 µl of DEPC-treated water in the
presence of 1-2 µl of recombinant RNase inhibitor (PROMEGA)
and 0.1 mM DTT. 10 µl aliquots were used for each RT-PCR.
- Reverse transcriptase activity

RT-activity was tested with 20 mM Mg2+ and poly-Cm or
poly-C templates, in virion pellets or fractions from
sucrose gradients as previously described (3, 5, 26).

- cDNA synthesis and 'Pan-retro' RT-PCR with degenerate
primers

A total RT-activity between 10^-10^7 dpm was required in
the fraction containing the peak of purified virions. The
"Pan-retro" RT-PCR technique (27) was performed on virion
RNA extracted by the method of Chomczynski (20) and
dissolved in 20 µl RNase-free water. 5 µl RNA solution was
incubated for 30 min at 37°C with 0.3 units (3 units for
CSF series) of RNase-free DNase I (Boehringer) in a 20 µl
reaction containing 7.5 mM random hexamers, 5 mM Hepes-HCl
pH 6.9, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 50 mM Tris-HCl
pH 7.5, 0.5 mM each dNTP, and 20 units recombinant RNase
inhibitor (Promega). The DNase was then heat inactivated at
80°C for 10 min. 20 units MoMLV RT (Pharmacia) and a
further 20 units of RNase inhibitor were added to each tube
in a Genesphere™ enclosure (Safetech, Ireland) and cDNA was
synthesised for 90 min at 37°C. Following reverse
transcription, the cDNA was boiled for 5 min then cooled
rapidly on ice. The Round 1 PCR mix (final volume 25 µl per
reaction; 20 mM Tris-HCl pH 8.4, 60 mM KCl, 2.5 mM MgCl2,
200 ng each of primers PAN-UO and PAN-DI [see Figure 44],
0.2 mM each dNTP) was treated with 0.3 units DNase I and
then heat inactivated as above. 2.5 µl cDNA was added in
the Genesphere™ enclosure and the tubes heated to 80°C
before adding 0.5 units Taq polymerase (Perkin Elmer)
individually to each tube ("hot start"). Round 1 PCR
parameters were 35 cycles of 95°C for 1 min, 34°C for
30 sec, 72°C for 1 min, with a final 7 min extension at
72°C. 0.5 µl of Round 1 PCR product was transferred to the
Round 2 DNase-treated PCR mix (composition as for Round 1
but containing primers PAN-UI and PAN-DI) using the "hot
start" procedure. Round 2 PCR parameters were as for
Round 1 but using 30 cycles only and annealing at 45°C for
1 min.
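
The two cycling programs described above can be written down as a small data structure, which makes the differences between Round 1 and Round 2 easy to compare. The following Python sketch is an illustration only. The class names are hypothetical, ramp times between temperatures are ignored, the Round 1 annealing temperature is taken as 34°C as read from the text, and it is assumed (from "as for Round 1") that Round 2 keeps the 7 min final extension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time in seconds


@dataclass(frozen=True)
class Program:
    cycles: int
    steps: tuple            # denaturation, annealing, extension
    final_extension: Step

    def total_seconds(self) -> int:
        """Programmed hold time only; ramping between steps is not counted."""
        per_cycle = sum(step.seconds for step in self.steps)
        return self.cycles * per_cycle + self.final_extension.seconds


round1 = Program(35, (Step(95, 60), Step(34, 30), Step(72, 60)), Step(72, 7 * 60))
round2 = Program(30, (Step(95, 60), Step(45, 60), Step(72, 60)), Step(72, 7 * 60))

for name, program in (("Round 1", round1), ("Round 2", round2)):
    print(name, program.total_seconds() / 60, "min")
```

On these assumptions each round represents roughly an hour and a half of programmed hold time, before any ramping overhead of the thermocycler.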

- Cloning of PCR products

PCR products were cloned using the TA-cloning® kit (British
Biotechnology) according to the manufacturer's
recommendations.

- Sequencing

Sequencing reactions were performed using the "Prism ready
reaction kit dye deoxyterminator cycle sequencing kit"
(Applied Biosystems). Automatic sequence analysis was
performed on an automatic sequencer (Applied Biosystems,
373 A).

- RT-PCR with ST1 primer sets

The first PCR round was performed directly from the cDNA
reaction mixture according to the one-step RT-PCR technique
described by Mallet et al. (28). This one-step RT-PCR
procedure reduced the probability of airborne contamination
when opening the tubes and transferring PCR reagents after
an independent cDNA synthesis. RNA was extracted as
previously from 2 ml of plasma (snap-frozen in liquid
nitrogen and stored at -80°C) or from a 500 µl sucrose
fraction with a total RT-activity above 10^ dpm, and
resuspended in 50 µl of RNase-free water. For each RT-PCR
reaction, 10 µl of RNA solution was incubated in a
Perkin-Elmer 480 thermocycler, 15 min at 20°C with 1 U of
RNase-free DNase I and 1.2 µl of 10X DNase buffer (50 mM
Tris, 10 mM MgCl2 and 0.1 mM DTT) containing 1 U/µl of
RNase inhibitor (PROMEGA), and heated at 70°C for 10 min
for DNase inactivation. The solution was placed on ice and
mixed (in conditions preventing airborne dust/DNA
contamination) with 88 µl of PCR mix containing: 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each first round
primer (ST1.1 upstream primer:
5' AGGAGTAAGGAAACCCAACGGAC 3' (SEQ ID NO: 99); ST1.1
downstream primer: 5' TAAGAGTTGCACAAGTGCG 3' (SEQ ID
NO: 100)), 2.5 U/tube of Taq (Appligene) and 10 U/tube of
AMV-RT (Boehringer). Each tube was further incubated in a
Perkin-Elmer 480 thermocycler for 10 min at 65°C, followed
by 2 h at 42°C for cDNA synthesis and 5 min at 95°C for
inactivation of AMV-RT and DNA denaturation. First round
parameters were 40 cycles of 95°C for 1 min, 53°C for
2.5 min, 72°C for 1 min, with a final extension of 10 min
at 72°C. 10 µl of the first round were transferred to the
second round PCR mix previously treated at 20°C for 15 min
with RNase-free DNase I (0.02 U/µl) followed by DNase
inactivation at 70°C for 10 min. This mix contained 1X Taq
buffer, 25 nM/tube dNTPs, 40 pM/tube of each second round
primer [ST1.2 upstream primer:
5' TCAGGGATAGCCCCCATCTAT 3' (SEQ ID NO: 101); ST1.2
downstream primer: 5' AACCCTTTGCCACTACATCAATTT 3' (SEQ ID
NO: 102)] and 2.5 U/tube of Taq (Appligene). Second round
parameters were 30 cycles of 95°C for 1 min, 53°C for
1.5 min, 72°C for 1 min, with a final extension of 8 min
at 72°C. 20 µl of this nested RT-PCR product were deposited
on a 0.7% agarose gel containing ethidium bromide and
exposed to UV light for the visualization of amplified
products.

- Hybridisation analysis of PCR products: MSRV-pol
detection by ELOSA

The protocol was essentially as previously described (21)
but with the following modifications: Nunc Maxisorb
microtitre plates were coated with 100 ng per well capture
probe CpV1b (see Figure 44) either by passive adsorption
(21) or alternatively by using streptavidin coated plates
and biotinylated CpV1b. Peroxidase-labelled detector probe
DpV1 (see Figure 44) was used and the assay cut-off was
defined as the mean of 4 negative controls plus 0.2 OD492
units.
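
The cut-off rule stated above (mean of 4 negative controls plus 0.2 OD492 units) is simple enough to express directly. The Python sketch below is an illustration under stated assumptions: the function name `elosa_cutoff` and all OD values are hypothetical, and a sample is treated as positive when its OD is strictly above the cut-off, which the text does not specify.

```python
from statistics import mean


def elosa_cutoff(negative_control_ods, margin=0.2):
    """Assay cut-off: mean OD492 of the 4 negative controls plus a fixed margin."""
    if len(negative_control_ods) != 4:
        raise ValueError("the protocol uses 4 negative controls")
    return mean(negative_control_ods) + margin


# Hypothetical OD492 readings (not taken from the study).
negatives = [0.04, 0.05, 0.06, 0.05]
cutoff = elosa_cutoff(negatives)

sample_od = 0.41
print(sample_od > cutoff)
```

Unlike the ELISA threshold of Example 17, this rule uses a fixed margin rather than a multiple of the standard deviation, so it does not depend on the spread of the negative controls.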

- RNA extraction, cDNA synthesis and PCR amplification
from MS plasma samples:

Total RNA was extracted from human MS plasma by a guanidium
method as described elsewhere (29). Total RNA extracted
from 100 µl of plasma was treated with RNase-free DNase I
(0.1 U/µl; Boehringer Mannheim, France) and reverse
transcribed under the conditions recommended by the
manufacturer, using Superscript reverse transcriptase
(Gibco-BRL, France). The resulting cDNAs were amplified by
semi-nested PCR through 35 cycles (94°C 1 min, 55°C 1 min,
72°C 1 min 30 sec) and 72°C 8 min for a final extension.
Three different fragments in the RT region were amplified
by the following specific primers:

- in the protease (PRT) region, for the 1st and 2nd round
of PCR, respectively, sense primer
[5' TGG AGG AGC AGG ACT GAG GGT 3' (SEQ ID NO: 103)] and
antisense primers [5' GTG TGG GTT GGG TTT GGT TAG TGG T 3'
(SEQ ID NO: 104) / 5' GAG AGG AAA TGG GTA TTG CTT TGG 3'
(SEQ ID NO: 105)]

- in the fragment A of the RT region (Cf. Fig. 46), for the
1st and 2nd round of PCR, respectively, sense primer
[5' AGG AGT AAG GAA AGC CAA GGG AGA G 3' (SEQ ID NO: 106)]
and antisense primers [5' TGT ATA TAA TGG TGT GGG TAT TGG
G 3' (SEQ ID NO: 107) / 5' TTG GGG AGA AAC GTG TTA TGG CAA
GG 3' (SEQ ID NO: 108)]

- in the fragment B of the RT region (Cf. Fig. 46), for the
1st and 2nd round of PCR, respectively, sense primers
[5' GGG TGT GGT GAG AGG AGA TTA GAT AG 3' (SEQ ID NO: 109)
/ 5' AAA GGG AGG AGG GCC GTG AGT GAG GA 3' (SEQ ID
NO: 110)] and antisense primer 3' [5' GGT TTA AGA GTT GCA
GAA GTG GGG AGT C 3' (SEQ ID NO: 101)].

The amplified fragments were analysed on ethidium
bromide-stained agarose gels, cloned in TA cloning vector
(Invitrogen) and sequenced.
2) RESULTS

- Specific retroviral RNA is found in extracellular
virions from MS patient-derived cell cultures and in MS
patients' CSF.

Choroid plexus cells (4) (obtained post-mortem) and
EBV-immortalized peripheral blood B-lymphocytes (30, 31)
from MS patients gave rise to cultures expressing
100-120 nm viral particles associated with RT-activity
similar to that of the original LM7 isolate (3). Similar
cell-types from non-MS donors produced neither this
RT-activity nor virions. All the 'infected' cultures were
poorly and/or transiently productive and/or had a limited
lifespan. Therefore, in order to analyse the genomic RNA
present in the very limited quantity of extracellular
virions, we used an RT-PCR approach to amplify, with
degenerate primers, a conserved region of the pol gene
present in all known retroviruses (12); the techniques
based on this approach will be called "Pan-retro" RT-PCR.
Extensive DNase treatment of samples and reagents was
essential, because human DNA contains many endogenous
retroviral elements amplifiable by this technique.

"Pan-retro" RT-PCR experiments were performed on
sucrose-density gradient purified virions from supernatants
of different types of cell cultures and their non-infected
controls: (i) choroid plexus cells sampled post-mortem
from MS brain (PLI-1), (ii) choroid plexus cells from
non-MS brain autopsy, infected by co-culture with
irradiated LM7 cells (LM7P), and (iii) identical
non-infected choroid-plexus cells. "Early" B-cell lines
obtained by spontaneous in vitro transformation of two
EBV-seropositive individuals, (iv) one MS patient and (v)
one non-MS control, were also analysed. Figure 43
illustrates the RT-activity in sucrose-gradient fractions
obtained from the B-cell cultures. The technique described
by Shih et al. (12) was modified in a semi-nested RT-PCR
protocol (27) using degenerate primers (Fig. 2) and
extensive DNase treatment. PCR amplifications were
performed in London (Dpt of Virology, U.C.L.M.S.) on coded
aliquots of the density gradient fractions. Blind and
systematic cloning and sequencing of the PCR products were
undertaken in an independent laboratory (bioMerieux, Lyon).
After complete sequencing of 20 to 30 clones per sucrose
gradient fraction, the codes were broken and results
analysed in parallel with the RT-activity data.

Table 5 presents the distribution of sequences obtained
from sucrose gradient fractions containing the peak of
viral RT-activity in MS-derived cultures and also the
sequences amplified from the corresponding RT-activity
negative fractions of uninfected cultures. The predominant
sequence detected in bands of the expected size
(approximately 140 bp) amplified in all the RT-activity
positive fractions (but not in the RT-activity negative
fractions) was different from known retroviruses and was
designated MSRV-cpol. MSRV-cpol sequences exhibited partial
homology (70-75%) with ERV9, a previously described
endogenous retroviral sequence (18). A few ERV9 sequences
(>90% homology with ERV9) were also present but clearly
represented a minority of clones. In addition to typical
pol sequences, numerous PCR artefacts (primer multimers,
concatemers or single-primer amplifications) related to the
use of degenerate primers and low-temperature annealing
were found in all samples (Table 5).

Figure 44 shows an alignment of a consensus sequence of
MSRV-cpol with the corresponding VLPQG / YMDD region of
diverse retroviruses. Figure 45 displays a phylogenetic
tree based on the evolutionarily conserved amino acid
sequences of both exogenous and endogenous retroviruses in
this region. From this tree it can be seen that the pol
gene of MSRV is phylogenetically related to the C-type
group of oncovirinae.

A small scale study was performed to determine the
prevalence of MSRV-cpol sequences in the CSF of patients
with MS. Identification of MSRV-cpol in PCR products by
cloning and sequencing is both laborious and time
consuming. We therefore devised an enzyme-linked
oligosorbent assay (ELOSA), using a capture probe (CpV1B)
and a peroxidase-labelled detector probe (DpV1), for the
rapid identification of MSRV-cpol sequences in
'Pan-retrovirus' PCR products (Figure 44). The specificity
of this sandwich hybridisation-based assay for MSRV-cpol
was tested with both distantly related (HIV and MoMLV) and
closely related (ERV9) pol sequences. No significant cross
reactivity with such targets was observed despite the
ability of the ELOSA to detect as little as 0.01 ng of
MSRV-cpol DNA.

Cerebrospinal fluid (CSF) samples were available from 10
patients with MS and from 10 patients with other
neurological disorders. Total RNA was extracted from CSF
pellets, reverse transcribed and amplified as above. ELOSA
analysis (Table 6) of the PCR products revealed MSRV-cpol
sequences in 5 of the 10 MS patient samples but in none of
the 10 samples from patients with other neurological
diseases (P<0.05). The presence of MSRV-cpol did not appear
to be correlated with age, sex or type of MS, but was seen
in untreated patients only (5/6). No patient with
immunosuppressive therapy was found positive (0/4). No
correlation between MSRV-cpol detection and CSF cell count
was observed.
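
The text does not say which statistical test gave P<0.05 for the CSF comparison (5 of 10 MS samples positive versus 0 of 10 controls). As an illustration only, the Python sketch below applies a two-sided Fisher's exact test, a common choice for small 2x2 tables; the choice of test and the function name are assumptions, not taken from the source.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def weight(k: int) -> int:
        # Number of tables with k positives in row 1, all margins fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / denom


# Rows: MS patients, other neurological diseases.
# Columns: MSRV-cpol positive, MSRV-cpol negative.
p_value = fisher_exact_two_sided(5, 5, 0, 10)
print(f"p = {p_value:.4f}")  # p = 0.0325
```

Under this assumption the result is compatible with the P<0.05 quoted above. Integer table counts are compared rather than floating-point probabilities, which avoids rounding problems when deciding which tables are "as extreme" as the observed one.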

- Cloning and sequencing a larger region of the pol gene 

An independent identification of the MSRV
genomic sequence was obtained by a non-PCR approach using
RNA extracted from concentrated virions derived from 2.5
liters of LM7-infected sub-cultures of choroid plexus
cells. A limited number of clones was obtained by direct
cloning of the cDNA, one of which (PSJ17) showed partial
homology with ERV9 pol. Specific primers based on the
MSRV-cpol region and on the PSJ17 clone amplified a 740
bp fragment linking the two independent sequences in RNA
extracted from purified virions. PSJ17 was localised on
the 3' side of MSRV-cpol. Further sequence extension on
the 5' side of MSRV-cpol and on the 3' side of PSJ17 was
obtained using RT-PCR approaches on RNA from purified
LM7-like virions produced in MS choroid plexus cultures (4).

In Figure 46, the nucleotide sequence
corresponding to overlapping clones obtained by sequence
extension in the pol gene is represented with the
amino acid translation corresponding to the putative open
reading frames (ORFs) of the protease and of the
reverse transcriptase. The active site motifs of the protease
(PRT) and of the reverse transcriptase (RT) are
underlined. In the C-terminal region of the RT sequence,
the dispersed amino acid residues regularly present in
retroviral RNase H domains are also underlined.
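As an illustration of how a translated reading frame exposes the RT active-site motif, the following sketch translates a 30-nucleotide excerpt of SEQ ID NO: 1 (positions 361-390 of the listing) in all three forward frames and searches each for YXDD. The standard genetic code is assumed, and the helper `translate` is a hypothetical name, not part of the described method.

```python
import re

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate a DNA string in the given frame; ambiguous codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Excerpt of SEQ ID NO: 1, positions 361-390 as printed in the listing.
fragment = "CTTGTCCTTCAGTACATGGATGATTTACTT"

for frame in range(3):
    protein = translate(fragment, frame)
    hit = re.search(r"Y.DD", protein)  # YXDD: conserved RT motif
    print(frame, protein, hit.group(0) if hit else "-")
```

Only frame 0 of this excerpt contains the motif, here in the YMDD form.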

- Non-degenerate primers detect MSRV-specific RNA in
virions associated with the peak of RT-activity and in
MS patients' plasma

PCR primers (ST1.1 primer set; positions 603-625/1732-1714,
on Fig. 46) based on overlapping clones in the pol
gene amplified a 1.15 kb segment of the RT region from
several different isolates obtained from different MS
patients. Nested primers (ST1.2; positions 869-889/1513-1490,
on Fig. 46) generated a 700 bp fragment (Figure 47)
which was more easily visualised by ethidium bromide
staining than the first round product generated by ST1.1.
The specificity of PCR products was confirmed by stringent
hybridisation with a peroxidase-labeled MSRV-cpol probe
(Fig. 44), using the ELOSA technique (21).
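The product sizes follow from the primer coordinates. The sketch below assumes that each coordinate pair is given as 5' end to 3' end, so that the antisense primer is written with the larger number first; `amplicon_length` is an illustrative helper name.

```python
def amplicon_length(forward, reverse):
    """Expected product length from primer coordinates on the sense strand.

    `forward` is (5' end, 3' end) of the sense primer; `reverse` is
    (5' end, 3' end) of the antisense primer, so its first value is larger.
    Assumption: coordinates are 1-based and inclusive.
    """
    fwd_5prime = forward[0]
    rev_5prime = reverse[0]
    return rev_5prime - fwd_5prime + 1


st1_1 = amplicon_length((603, 625), (1732, 1714))  # outer primer set
st1_2 = amplicon_length((869, 889), (1513, 1490))  # nested primer set
print(st1_1, st1_2)
```

With the coordinates as printed this gives 1130 bp and 645 bp, to be compared with the rounded sizes of 1.15 kb and 700 bp quoted in the text.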

The ST1.1 and ST1.2 primer sets were used to detect
extracellular MSRV RNA in human plasma, although they are
not optimal for this application. Figure 47 illustrates the
results of PCR amplification of cDNA derived from 2 MS
patient and 2 control plasma samples tested in parallel
with cDNA from the sucrose density gradient fractions of
an MS choroid plexus isolate. Taq-sequencing of the 700 bp
bands confirmed the presence of MSRV sequence. A very
faint 700 bp band is also visible in fraction 10, which
corresponds to the bottom of the tube where aggregated
particles usually sediment. Control RT-PCR for cellular
aldolase transcripts on plasma-derived RNA was negative,
indicating that the results were not due to cellular RNA
released by cell lysis during plasma separation. It should
be noted that this PCR technique was not designed for
epidemiological studies since its sensitivity is impaired
by the length of the cDNA required (1.15 kb).

Non-degenerate primers amplifying three
fragments of the pol gene (the whole protease region,
regions A and B of the reverse transcriptase; cf. Fig. 46)
were also used to confirm the presence of MSRV sequences
in DNase-treated RNA from MS plasma. These fragments were
amplified from the plasma of a further 4 MS patients with
active disease. Sequence analysis confirmed that the PRT
and RT regions were homologous (>95% and >90%
respectively) to MSRV sequences previously obtained on
culture virion. No such sequences were detected in plasma
from healthy controls (n=4), tested in parallel with MS
plasma.
3) DISCUSSION 
- Phylogeny of MSRV 

From the results of this study, it can be
concluded that the virus previously referred to as "LM7"
(3, 5, 26) possesses an RNA genome containing the MSRV pol
sequences described here.

The conserved RT motif of both MSRV and ERV9 is two amino
acids shorter than that of other retroviruses, apart from
human foamy viruses which nonetheless have a functional
RT. The potential ORF encompassing the entire PRT-RT
region is consistent with the virion-associated
RT-activity detected in sucrose density gradients with
infected culture supernatants. Moreover, since we have
recently succeeded in expressing a recombinant protein
from the sequence of MSRV protease cloned from MS plasma,
we can confirm the reality of the potential PRT ORF.
Similar cloning and expression of other sequences
containing potential ORFs for MSRV proteins is being
undertaken to confirm their ability to encode enzymes and
structural proteins of MSRV virions.

The phylogenic tree in Figure 45, based on the most
conserved amino acid sequence in retroviruses
(VLPQG...YXDD), shows that the MSRV pol gene is related to
the C-type oncoviruses. Apart from ERV9, the closest known
retroviral element is RTVL-H, a human endogenous sequence
known to have a subtype with a functional pol gene (32).
In the pol region, this phylogenic affiliation to C-type
oncoviruses apparently contradicts our previous
assumptions based on the general morphology of the
particles observed by electron microscopy (EM), which were
compatible with a B or D-type oncovirus (3, 5, 26).
However, preliminary data on env sequences detected in
MSRV virions would suggest a greater phylogenic proximity
to D-type. Such differences in phylogenies of the pol and
env genes have been described in MPMV and suggest a
recombinatorial origin in D-type retroviruses (33). D to C
type morphological conversion is also possible since it
has been reported that a single amino acid substitution in
the gag protein can convert retrovirus morphology to that
of a different type (34).

- Is MSRV an exogenous retrovirus sharing extensive 
homology with a related endogenous retrovirus family or an 
endogenous retrovirus producing extracellular virions? 

Southern blot analysis with an MSRV pol probe
under stringent conditions showed hybridisation with a
multicopy endogenous family (data not presented),
indicating the existence of endogenous elements more
closely related to MSRV than ERV9 itself. Consequently, we
were unable to look for a virion-specific provirus in
MSRV-producing cells. In agreement with Southern blot
findings, PCR studies on genomic DNA showed multiple band
amplification of MSRV-related endogenous sequences. Since
pol is the most conserved retroviral gene, the sequence
described here is the least suitable region to
discriminate between exogenous and endogenous sequences.
It is hoped that sequence information from other parts of
the genome may permit such a discrimination, even if only
on a tiny portion, as has recently been demonstrated for
the Jaagsiekte retrovirus (JSRV) of sheep (35). With such
sequence data, it would then become possible to identify
the MSRV-specific provirus in the genome of
virion-producing cell cultures.

MSRV could represent a virion-producing exogenous member
of an ERV9-like endogenous family, just as exogenous
strains exist in the well-studied mouse mammary tumour
virus (MMTV) and murine leukaemia virus (MuLV) retroviral
families of mice, and also in the JSRV retroviral family
of sheep (36). Alternatively, it is also conceivable that
the extracellular MSRV virions may be produced by a
replication-competent endogenous provirus. Whether MSRV is
exogenous or endogenous, conceptual similarities exist
with the category of retroviruses represented by MuLV,
MMTV and JSRV. Unlike defective endogenous elements, this
category of agents is known to produce infectious and
pathogenic virions, to cause neurological disease (37),
solid tumours / leukaemias (36, 38) and to express
"endogenous superantigens" (39, 40). Furthermore, in MuLV
infections, the genetic endogenous retroviral background
of the mouse strain can determine susceptibility or
resistance to disease (39, 41). Indeed, such interactions
between an infectious retrovirus and its endogenous
counterpart may be relevant in the pathogenesis of MS,
since endogenous retroviral genotypes are not identical in
all individuals. A genetic control due to related
endogenous retroviral genotypes could therefore contribute
to the known hereditary susceptibility to MS (43), if MSRV
does indeed play an active role in this disease.

Elsewhere, the data in Table 5 suggest that ERV9 elements
may be co-expressed, possibly via trans-activation in
infected cells, and give rise to heterologous RNA
packaging in MSRV virions. Such heterologous packaging is
known to occur in other retroviral systems (42).

- A role for the numerous common viruses previously evoked
in MS?

Among the numerous reports of viruses putatively
involved in the aetiopathogenesis of MS, a significant
proportion focus on two viral families, the
paramyxoviridae and the herpesviridae. Regarding the
paramyxoviridae, the key observation is of a frequently
increased antibody titer to measles virus in MS patients,
essentially directed, in CSF, against measles fusion
protein (44). The existence of amino acid similarities
between conserved domains of the fusion proteins of
paramyxoviridae and the transmembrane protein of
retroviruses (45) may explain this observation if
antigenic cross-reactivity between these two proteins
occurred.

With regard to the herpesvirus family, the involvement of
Epstein-Barr Virus (EBV), Herpes Simplex Virus type 1
(HSV-1) and, most recently, Human Herpes Virus 6 (HHV-6)
has been proposed (31, 46, 47). From our previous studies
and from those of other groups, it appears that
herpesviruses may play an important role in MSRV
expression: we have shown that HSV-1 immediate-early ICP0
and ICP4 proteins can transactivate MSRV/LM7 in vitro (6)
and Haahr et al. have proposed an important
epidemiological role for EBV, as a co-factor in MS,
triggering retrovirus reactivation (31). The recent
description by Challoner et al. (47) showing significant
expression of HHV-6 proteins in MS plaques may also suggest
a similar role for HHV-6 in the brain.

EXAMPLE 19 : MSRV GENOME DETECTION TECHNIQUE

Following 0.4 µm filtration to remove cellular
debris and RNase digestion to remove residual
non-encapsidated RNA, serum was processed to extract viral RNA
by means of adsorption to a silica matrix. Viral RNA was
subjected to DNase digestion, then a combined reverse
transcription-PCR (RT-PCR) reaction was performed using
primers PTpol-A (sense: 5'xxxx3', SEQ ID NO: 183) and
PTpol-F (antisense: 5'xxxx3', SEQ ID NO: 184). A second
round of amplification with nested primers PTpol-B (sense:
5'xxxx3', SEQ ID NO: 185) and PTpol-E (antisense: 5'xxxx3',
SEQ ID NO: 186) generated a 435 bp PCR product which was
identified by gel electrophoresis. The specificity of each
product was confirmed by dideoxy sequencing. Control
reactions without reverse transcriptase were performed to
ensure that the products were derived from viral RNA. In
addition, to exclude the possibility that the extracted
viral RNA might be contaminated with host cell derived
nucleic acids, aliquots were tested by nested PCR for the
presence of pyruvate dehydrogenase (PDH) DNA and RNA.
Samples which generated a signal in either the PDH or the
"no-RT" PCR assays were excluded from the analysis.

Sera from patients with clinically active MS and
controls were amplified by RT-PCR and sequenced. Virion
associated MSRV-RNA was detected in the serum of 10 of 19
(53%) patients with MS but in only 3 of 44 controls
without MS (P=0.0001). The control group consisted of 8
patients (all MSRV-RNA negative) with rheumatological
disorders and 36 healthy adults. MSRV-RNA titres in both
MS patients and controls were apparently low because even
moderate dilution of sera (<10 fold) caused loss of
signal.

In MS patients, detection of MSRV-RNA was not
associated with age, sex, disease duration, or MS type;
however, a significant negative correlation with treatment
was observed. 26 serum samples were obtained from the 19
patients; 100% of the sera from untreated patients
contained detectable MSRV-RNA whereas it was detectable in
only 4 of 19 samples (21%) obtained during treatment with
corticosteroids and/or azathioprine (P=0.001).

The reason for the apparent loss of virion
associated MSRV-RNA during immunosuppressive treatment is
unknown but the finding is in agreement with the previous
observations on the detection of MSRV in cerebrospinal
fluid.

TABLE 7

DETECTION OF VIRION ASSOCIATED MSRV-RNA IN MS UNTREATED
PATIENTS & CONTROLS

                                        Positive   Negative   Total   % Positive
Controls without MS (1)                 3 (2)      41         44      7%
MS sera untreated at time of sampling   7          0          7       100%

(1) The control group consisted of 8 patients with
miscellaneous non-MS disorders and 36 healthy adults.
(2) The detection of MSRV RNA in plasma of a few controls, in
conditions which select virion-packaged RNA, is consistent
with the knowledge that a virus associated with MS should
be present in a minor proportion of the apparently healthy
population. Indeed, such individuals can be either healthy
carriers or be in the pre-clinical (or sub-clinical) phase
of the disease, which can last for years.

METHOD:

- Modified SNAP RNA extraction with filtration and RNase
digestion

(All centrifugations are at room temperature)
Up to 500 microlitres of serum is filtered using
0.45 micron spin filters (Nanosep MF from Flowgen,
Catalogue No. U3-0126, Ref. ODM45). The serum is spun for
5 min at 130,000 g (or for a further 10 min if necessary).

150 microlitres of filtered serum is incubated
with 10 units RNase One (Promega Catalogue No. M4261) for
30 min at 37°C.

The 150 microlitres was then extracted using the
SNAP RNA extraction kit (Invitrogen) as below:

- 10 micrograms of poly A RNA was added to the
450 microlitres of Binding Buffer to act as a carrier;
this was then added to the serum and mixed by inversion 6
times; 300 microlitres of propan-2-ol was then added and
mixed by inversion 10 times; 500 microlitres was
transferred to the SNAP column and spun at 1300 g for
1 min and the flow-through discarded; the remainder was
then added to the SNAP column and spun at 1300 g for 1 min
and the flow-through discarded; the column was then
washed with 600 microlitres of Super wash and the
flow-through discarded; the column was then washed with 600
microlitres of 1x RNA wash and the flow-through
discarded; this wash was repeated with a 2 min 1300 g
spin and the flow-through discarded; the bound nucleic
acid was then eluted by incubating with 135 microlitres of
RNase free water for 5 min and spun at 1300 g for 1 min.

- 15 microlitres of 10x DNase buffer and 3
microlitres (30 units) of DNase I, RNase free (Boehringer
Mannheim Cat. No. 776 785) was added and incubated for 30
min at 37°C; 450 microlitres of Binding Buffer was added
and mixed by inversion 6 times; 300 microlitres of
propan-2-ol was then added and mixed by inversion 10
times; 500 microlitres was transferred to the SNAP column
and spun at 1300 g for 1 min and the flow-through
discarded; the remainder was then added to the SNAP
column and spun at 1300 g for 1 min and the flow-through
discarded; the column was then washed with 600
microlitres 1x RNA wash and the flow-through discarded;
this wash was repeated with a 2 min 1300 g spin and the
flow-through discarded; the bound nucleic acid was then
eluted by incubating with 105 microlitres of RNase free
water for 5 min and spun at 1300 g for 1 min.
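The two-pass loading of the column follows from the volumes involved. The sketch below adds up the volumes given in the two binding steps and splits each mixture into loads of at most 500 microlitres, the volume transferred in the first pass; treating 500 microlitres as the per-load limit is an assumption, and `column_loads` is a hypothetical helper.

```python
def column_loads(total_ul, capacity_ul=500):
    """Split a volume into successive loads no larger than capacity_ul."""
    loads = []
    remaining = total_ul
    while remaining > 0:
        load = min(capacity_ul, remaining)
        loads.append(load)
        remaining -= load
    return loads


# First binding step: filtered serum + Binding Buffer + propan-2-ol.
first_mix = 150 + 450 + 300
# Second binding step: eluate + 10x DNase buffer + DNase I,
# followed by Binding Buffer and propan-2-ol.
second_mix = (135 + 15 + 3) + 450 + 300

print(first_mix, column_loads(first_mix))
print(second_mix, column_loads(second_mix))
```

Both mixtures come to roughly 900 microlitres, which is why each is applied as 500 microlitres followed by "the remainder".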

- Titan RT-PCR

RT-PCR was performed using the Titan one tube
RT-PCR system (Boehringer Mannheim Cat. No. 1 855 476). 25
microlitres of RNA was used in the combined RT-PCR
reaction. The total reaction volume was 50 microlitres.
Promega rRNasin (10 units) was the RNase inhibitor used.
170 ng of primers SEQ ID NO: 183 and SEQ ID NO: 184,
respectively, were used. A single master mix was prepared
and the sample RNA added last. This was performed at room
temperature, not on ice.

The RT step consisted of two sequential 30 min
incubations at 50°C and then SO^'C. This was immediately 
followed by the PCR which had the following steps:

* Initial denaturation of template at 94°C for 2 min,
* 40 cycles of 94°C for 30 seconds; 60°C for 30 seconds;
68°C for 45 seconds,
* 1 cycle of 68°C for 7 min.
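The first-round cycling programme can be written down as a small data structure, which makes the programmed hold time easy to total. This is a sketch only: the class names `Step` and `Stage` are illustrative, the preceding RT incubations are left out, and ramping time between temperatures is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class Stage:
    steps: tuple
    cycles: int = 1

    def duration(self):
        """Total hold time of this stage in seconds."""
        return self.cycles * sum(step.seconds for step in self.steps)


# First-round PCR as described above (RT incubations not included).
first_round = [
    Stage((Step(94, 120),)),                                        # initial denaturation
    Stage((Step(94, 30), Step(60, 30), Step(68, 45)), cycles=40),   # amplification
    Stage((Step(68, 420),)),                                        # final extension
]

total_s = sum(stage.duration() for stage in first_round)
print(f"{total_s} s = {total_s / 60:.0f} min of programmed hold time")
```

The 40 amplification cycles account for 70 of the 79 minutes of hold time.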

The second round PCR was performed using the
Expand long template PCR system (Boehringer Mannheim Cat.
No. 1681 842). 0.5 microlitres of the RT-PCR mix was added
to 25 microlitres of the round 2 PCR mix. Buffer No. 3 and
50 ng of primers B and E were used. The PCR had the
following steps:

* 5 cycles of 94°C for 30 seconds, 60°C for 30 seconds,
68°C for 45 seconds,
* 1 cycle of 68°C for 7 min.

The PCR products were then run on a 2% agarose
gel.

The no RT controls were performed using the "Expand"
PCR system for both rounds. The first round was 40 cycles
and the second round 20 cycles.

As a positive control a DNA dilution series was
used in both the RT-PCR and the "no RT" PCR. For a result
to be valid the RT-PCR and "no-RT" PCRs had to have
detected DNA equivalent to between 1 and 0.1 cells.

The analysis of PCR products of an approximately
435 bp fragment in the pol region is shown in Table 8.

TABLE 8

ANALYSIS OF PCR PRODUCTS WITH ORF

Exp     Disease   Clone   ORF   Fragment (bp)   AA-RT Motif Site
46-7    MS        1             429             YGDD
                  5       +     429             YGDD
                  8       +     429             YGDD
6B-1    MS        41            438             YMDD
                  42            438             YMDD
                  43            438             YMDD

* Defective RNA can also be present in circulating
virions, since the fidelity of the MSRV reverse
transcriptase appears to be low and since recombination
events with related endogenous elements can occur. It is
then obvious that the intra- and inter-patient
variability can be greater than that illustrated in this
example, because of these encapsidated defective MSRV RNA
copies.

Table 9, the data of which have been determined from the
alignments of Figures 49 to 53, shows a variability:
- between the clones obtained from the same patient plasma
sample in the same PCR amplification experiment; this
means that the patient possesses a virion population which
comprises different MSRV variants at a given time,
- between the sequenced variant populations from different
patients; this means that the variants differ from one
patient to another.

TABLE 9

Degree of identity (percentage) between nucleotide
sequences and between peptide sequences,
by direct comparison of said sequences (see Figures 49-53)

Patient                68-1                                 46-7

Nucleotide sequences   between SEQ ID NO: 169               between SEQ ID NO: 176
                       and MSRV-pol (SEQ ID NO: 1)          and MSRV-pol (SEQ ID NO: 1)
                       90.4 % (a)                           82.5 % (a)
                       92.3 % (b)                           84 % (b)

                       SEQ ID NOs: 170, 171,                SEQ ID NOs: 177, 178,
                       172 between them                     179 between them
                       98.6 % (a)                           94.5 % (a)
                       98.7 % (b)                           95.1 % (b)

Peptide sequences      between SEQ ID NOs: 173,             between SEQ ID NOs: 180,
                       174, 175 and SEQ ID NO:              181, 182 and SEQ ID NO:
                       81 %                                 73.5 %

                       SEQ ID NOs: 173, 174, 175            SEQ ID NOs: 180, 181, 182
                       between them                         between them
                       97 %                                 89 %

a) this percentage is determined on the basis of sequences
excluding the primers
b) this percentage is determined on the basis of sequences
including the primers.

From Figures 53A and 53B, the variability between the
sequences of the tested patients can be determined:
- between SEQ ID NO: 169 and SEQ ID NO: 176: 16.5 % (a) and
14.8 % (b)
- between the peptide sequences obtained from
SEQ ID NO: 169 and SEQ ID NO: 176: 20 %.
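The identity and variability percentages are position-by-position comparisons of aligned sequences. The alignments of Figures 49 to 53 are not reproduced in this section, so the sketch below only illustrates the arithmetic, using two 21-nucleotide primers from the sequence listing (SEQ ID NO: 18 and SEQ ID NO: 21) as stand-in input; `percent_identity` is a hypothetical helper and assumes a gap-free alignment.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x == y)
    return 100.0 * matches / len(seq_a)


seq_18 = "TCAGGGATAGCCCCCATCTAT"  # SEQ ID NO: 18
seq_21 = "TTAGGGATAGCCCTCATCTCT"  # SEQ ID NO: 21

identity = percent_identity(seq_18, seq_21)
print(f"identity {identity:.1f} %, variability {100 - identity:.1f} %")
```

Variability as used in the text is simply 100 % minus the identity for the same pair of sequences.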

Four microorganisms are mentioned in the
specification (page 3, lines 15-26) and they are identified
below. They have all been deposited with the ECACC*, in
accordance with the provisions of the Budapest Treaty.

- LM7PC deposited on 22nd July 1992 under No. 92072201,
- PLI-2 deposited on 8th January 1993 under No. 93010817,
- POL-2 deposited on 22nd July 1992 under No. V92072202,
and
- MS7PG deposited on 8th January 1993 under No. V93010816.

* ECACC : European Collection of Animal Cell Cultures
Vaccine Research and Production Laboratory
Public Health Laboratory Service
Centre of Applied Microbiology and Research
Porton Down
Salisbury, Wiltshire SP4 0JG
United Kingdom
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(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 4 72 69 84 30 

(B) TELEFAX: 4 72 69 84 31 

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1158 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 300
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA GACTTGAGTC AATTCTCATA CCTGGACACT 360
CTTGTCCTTC AGTACATGGA TGATTTACTT TTAGTCGCCC GTTCAGAAAC CTTGTGCCAT 420
CAAGCCACCC AAGAACTCTT AACTTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 480
AAGGCTCGGC TCTGCTCACA GGAGATTAGA TACTNAGGGC TAAAATTATC CAAAGGCACC 540
AGGGCCCTCA GTGAGGAACG TATCCAGCCT ATACTGGCTT ATCCTCATCC CAAAACCCTA 600
AAGCAACTAA GAGGGTTCCT TGGCATAACA GGTTTCTGCC GAAAACAGAT TCCCAGGTAC 660
ASCCCAATAG CCAGACCATT ATATACACTA ATTANGGAAA CTCAGAAAGC CAATACCTAT 720
TTAGTAAGAT GGACACCTAC AGAAGTGGCT TTCCAGGCCC TAAAGAAGGC CCTAACCCAA 780
GCCCCAGTGT TCAGCTTGCC AACAGGGCAA GATTTTTCTT TATATGCCAC AGAAAAAACA 840
GGAATAGCTC TAGGAGTCCT TACGCAGGTC TCAGGGATGA GCTTGCAACC CGTGGTATAC 900
CTGAGTAAGG AAATTGATGT AGTGGCAAAG GGTTGGCCTC ATNGTTTATG GGTAATGGNG 960
GCAGTAGCAG TCTNAGTATC TGAAGCAGTT AAAATAATAC AGGGAAGAGA TCTTNCTGTG 1020
TGGACATCTC ATGATGTGAA CGGCATACTC ACTGCTAAAG GAGACTTGTG GTTGTCAGAC 1080
AACCATTTAC TTAANTATCA GGCTCTATTA CTTGAAGAGC CAGTGCTGNG ACTGCGCACT 1140
TGTGCAACTC TTAAACCC 1158

(2) INFORMATION FOR SEQ ID NO: 2:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 297 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CCCTTTGCCA CTACATCAAT TTTAGGAGTA AGGAAACCCA ACGGACAGTG GAGGTTAGTG 60
CAAGAACTCA GGATTATCAA TGAGGCTGTT GTTCCTCTAT ACCCAGCTGT ACCTAACCCT 120
TATACAGTGC TTTCCCAAAT ACCAGAGGAA GCAGAGTGGT TTACAGTCCT GGACCTTAAG 180
GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 240
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAAGGGA 297

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTTTAGGGAT ANCCCTCATC TCTTTGGTCA GGTACTGGCC CAAGATCTAG GCCACTTCTC 60
AGGTCCAGSN ACTCTGTYCC TTCAG 85
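Each entry states a length that can be checked against the printed sequence once the block spacing and the running position counters are removed. A minimal sketch, assuming that every alphabetic run on a sequence line is sequence and every number is a counter; `parse_listing` is a hypothetical helper name.

```python
import re


def parse_listing(text):
    """Join the sequence blocks of a listing, dropping spaces and position counters."""
    return "".join(re.findall(r"[A-Za-z]+", text)).upper()


seq_3_lines = """
GTTTAGGGAT ANCCCTCATC TCTTTGGTCA GGTACTGGCC CAAGATCTAG GCCACTTCTC 60
AGGTCCAGSN ACTCTGTYCC TTCAG 85
"""

seq_3 = parse_listing(seq_3_lines)
declared_length = 85  # from "(A) LENGTH: 85 base pairs"
print(len(seq_3), len(seq_3) == declared_length)
```

The same check applies to any other entry by pasting its sequence lines and declared length.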



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 86 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTTCAGGGAT AGCCCCCATC TATTTGGCCA GGCACTAGCT CAATACTTGA GCCAGTTCTC 60
ATACCTGGAC AYTCTYGTCC TTCGGT 86

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GTTCARRGAT AGCCCCCATC TATTTGGCCW RGYATTAGCC CAAGACTTGA GYCAATTCTC 60
ATACCTGGAC ACTCTTGTCC TTYRG 85

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 85 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GTTCAGGGAT AGCTCCCATC TATTTGGCCT GGCATTAACC CGAGACTTAA GCCAGTTCTY 60
ATACGTGGAC ACTCTTGTCC TTTGG 85

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GTGTTGCCAC AGGGGTTTAR RGATANCYCY CATCTMTTTG GYCWRGYAYT RRCYCRAKAY 60
YTRRGYCAVT TCTYAKRYSY RGSNAYTCTB KYCCTTYRGT ACATGGATGA C 111
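SEQ ID NOs: 3 to 7 use degenerate symbols (R, Y, S, W, K, M, B, V, N) alongside A, C, G and T. Assuming these follow the standard IUPAC nucleotide ambiguity codes, a consensus can be turned into a regular expression and tested against a specific sequence. The sketch below uses positions 3-23 of the SEQ ID NO: 5 consensus and two primers from the listing; `consensus_to_regex` is an illustrative name.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (assumed to be the convention used here).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a degenerate consensus into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


# Positions 3-23 of the SEQ ID NO: 5 consensus.
pattern = consensus_to_regex("TCARRGATAGCCCCCATCTAT")

print(bool(pattern.fullmatch("TCAGGGATAGCCCCCATCTAT")))  # SEQ ID NO: 18
print(bool(pattern.fullmatch("TTAGGGATAGCCCTCATCTCT")))  # SEQ ID NO: 21
```

SEQ ID NO: 18 is one of the sequences covered by this stretch of the consensus, whereas SEQ ID NO: 21 differs at non-degenerate positions and is not.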

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 645 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TCAGGGATAG CCCCCATCTA TTTGGCCAGG CATTAGCCCA AGACTTGAGT CAATTCTCAT 60
ACCTGGACAC TCTTGTCCTT CAGTACATGG ATGATTTACT TTTAGTCGCC CGTTCAGAAA 120
CCTTGTGCCA TCAAGCCACC CAAGAACTCT TAACTTTCCT CACTACCTGT GGCTACAAGG 180
TTTCCAAACC AAAGGCTCGG CTCTGCTCAC AGGAGATTAG ATACTNAGGG CTAAAATTAT 240
CCAAAGGCAC CAGGGCCCTC AGTGAGGAAC GTATCCAGCC TATACTGGCT TATCCTCATC 300
CCAAAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGGTTTCTGC CGAAAACAGA 360
TTCCCAGGTA CASCCCAATA GCCAGACCAT TATATACACT AATTANGGAA ACTCAGAAAG 420
CCAATACCTA TTTAGTAAGA TGGACACCTA CAGAAGTGGC TTTCCAGGCC CTAAAGAAGG 480
CCCTAACCCA AGCCCCAGTG TTCAGCTTGC CAACAGGGCA AGATTTTTCT TTATATGCCA 540
CAGAAAAAAC AGGAATAGCT CTAGGAGTCC TTACGCAGGT CTCAGGGATG AGCTTGCAAC 600
CCGTGGTATA CCTGAGTAAG GAAATTGATG TAGTGGCAAA GGGTT 645



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 741 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CAAGCCACCC AAGAACTCTT AAATTTCCTC ACTACCTGTG GCTACAAGGT TTCCAAACCA 60
AAGGCTCAGC TCTGCTCACA GGAGATTAGA TACTTAGGGT TAAAATTATC CAAAGGCACC 120
AGGGGCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT ATCCTCATCC CAAAACCCTA 180
AAGCAACTAA GAGGGTTCCT TAGCATGATC AGGTTTCTGC CGAAAACAAG ATTCCCAGGT 240
ACAACCAAAA TAGCCAGACC ATTATATACA CTAATTAAGG AAACTCAGAA AGCCAATACC 300
TATTTAGTAA GATGGACACC TAAACAGAAG GCTTTCCAGG CCCTAAAGAA GGCCCTAACC 360
CAAGCCCCAG TGTTCAGCTT GCCAACAGGG CAAGATTTTT CTTTATATGG CACAGAAAAA 420
ACAGGAATCG CTCTAGGAGT CCTTACACAG GTCCGAGGGA TGAGCTTGCA ACCCGTGGCA 480
TACCTGAATA AGGAAATTGA TGTAGTGGCA AAGGGTTGGC CTCATNGTTT ATGGGTAATG 540
GNGGCAGTAG CAGTCTNAGT ATCTGAAGCA GTTAAAATAA TACAGGGAAG AGATCTTNCT 600
GTGTGGACAT CTCATGATGT GAACGGCATA CTCACTGCTA AAGGAGACTT GTGGTTGTCA 660
GACAACCATT TACTTAANTA TCAGGCTCTA TTACTTGAAG AGCCAGTGCT GNGACTGCGC 720
ACTTGTGCAA CTCTTAAACC C 741



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TGGAAAGTGT TGCCACAGGG CGCTGAAGCC TATCGCGTGC AGTTGCCGGA TGCCGCCTAT 60
AGCCTCTACA TGGATGACAT CCTGCTGGCC TCC 93

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 96 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TTGGATCCAG TGYTGCCACA GGGCGCTGAA GCCTATCGCG TGCAGTTGCC GGATGCCGCC 60
TATAGCCTCT ACGTGGATGA CCTSCTGAAG CTTGAG 96

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 748 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TGCAAGCTTC ACCGCTTGCT GGATGTAGGC CTCAGTACCG GNGTGCCCCG CGCGCTGTAG 60
TTCGATGTAG AAAGCGCCCG GAAACACGCG GGACCAATGC GTCGCCAGCT TGCGCGCCAG 120
CGCCTCGTTG CCATTGGCCA GCGCCACGCC GATATCACCC GCCATGGCGC CGGAGAGCGC 180
CAGCAGACCG GCGGCCAGCG GCGCATTCTC AACGCCGGGC TCGTCGAACC ATTCGGGGGC 240
GATTTCCGCA CGACCGCGAT GCTGGTTGGA GAGCCAGGCC CTGGCCAGCA ACTGGCACAG 300
GTTCAGGTAA CCCTGCTTGT CCCGCACCAA CAGCAGCAGG CGGGTCGGCT TGTCGCGCTC 360
GTCGTGATTG GTGATCCACA CGTCAGCCCC GACGATGGGC TTCACGCCCT TGCCACGCGC 420
TTCCTTGTAG ANGCGCACCA GCCCGAAGGC ATTGGCGAGA TCGGTCAGCG CCAAGGCGCC 480
CATGCCATCT TTGGCGGCAG CCTTGACGGC ATCGTCGAGA CGGACATTGC CATCGACGAC 540
GGAATATTCG GAGTGGAGAC GGAGGTGGAC GAAGCGCGGC GAATTCATCC GCGTATTGTA 600
ACGGGTGACA CCTTCCGCAA AGCATTCCGG ACGTGCCCGA TTGACCCGGA GCAACCCCGC 660
ACGGCTGCGC GGGCAGTTAT AATTTCGGCT TACGAATCAA CGGGTTACCC CAGGGCGCTG 720
AAGCCTATCG CGTGCAGTTG CCGGATGC 748



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

GTAGTTCGAT GTAGAAAGCG 20

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GCATCCGGCA ACTGCACG 18

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TCAGGGATAG CCCCCATCTA T 21 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AACCCTTTGC CACTACATCA ATTT 24 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:

(B) LOCATION: 5, 1, 10, 13

(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

GGTCGTGCCG CAGGG 15 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
TTAGGGATAG CCCTCATCTC T 21

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
TCAGGGATAG CCCCCATCTA T 21

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
AACCCTTTGC CACTACATCA ATTT 24



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GCGTAAGGAC TCCTAGAGCT ATT 23

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TCATCCATGT ACCGAAGG 18 

(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ATGGGGTTCC CAAGTTCCCT 20 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCCGATATCA CCCGCCATGG 20 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCATCCGGCA ACTGCACG 18 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CGCGATGCTG GTTGGAGAGC 20 






(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
TCTCCACTCC GAATATTCCG 20



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GATCTAGGCC ACTTCTCAGG TCCAGS 26



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURES:

(B) LOCATION: 6, 12, 19

(D) OTHER INFORMATION: G represents inosine (i)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
CATCTGTTTG GGCAGGCAGT AGC 23

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CTTGAGCCAG TTCTCATACC TGGA 24

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

AGTGYTRCCM CARGGCGCTG AA 22 

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

GMGGCCAGCA GSAKGTCATC CA 22 
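
SEQ ID NO: 34 and SEQ ID NO: 35 contain symbols other than A, C, G and T (Y, R, M, S, K). The listing does not define these symbols. The sketch below assumes they are the standard IUPAC nucleotide ambiguity codes and enumerates the unambiguous sequences each entry would then represent. The names `IUPAC` and `expand_degenerate` are illustrative only and do not come from the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (an assumption: the listing
# itself does not define the symbols Y, R, M, S, K).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(seq):
    """Return every unambiguous sequence encoded by a degenerate sequence."""
    seq = seq.replace(" ", "").upper()
    return ["".join(p) for p in product(*(IUPAC[base] for base in seq))]

# Sequences copied from the listing (blocks of ten, spaces ignored).
SEQ_ID_34 = "AGTGYTRCCM CARGGCGCTG AA"
SEQ_ID_35 = "GMGGCCAGCA GSAKGTCATC CA"

variants_34 = expand_degenerate(SEQ_ID_34)
variants_35 = expand_degenerate(SEQ_ID_35)
print(len(variants_34), len(variants_35))
```

Under this assumption, SEQ ID NO: 34 has four two-fold degenerate positions and SEQ ID NO: 35 has three.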
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

GGATGCCGCC TATAGCCTCT AC 22

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
AAGCCTATCG CGTGCAGTTG CC 22

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TAAAGATCTA GAATTCGGCT ATAGGCGGCA TCCGGCAAGT 40



(2) INFORMATION FOR SEQ ID NO: 39 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 50 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 39

Asp Ala Phe Phe Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu Phe
1               5                   10                  15

Ala Phe Glu Asp Pro Leu Asn Pro Thr Ser Gln Leu Thr Trp Thr Val
            20                  25                  30

Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Gln Ala Leu
        35                  40                  45

Ala Gln
    50

(2) INFORMATION FOR SEQ ID NO: 40 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 150 base pairs 
(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear



(ii) MOLECULE TYPE : cDNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 40 



GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT 60 
CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC 120 
CCCCATCTAT TTGGCCAGGC ATTAGCCCAA 150
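
The 150 bases of SEQ ID NO: 40 correspond to 50 codons, the same length as the peptide of SEQ ID NO: 39. The listing does not state the reading frame. The sketch below assumes the standard genetic code and a frame starting at the first base, and translates SEQ ID NO: 40 so that the result can be compared with SEQ ID NO: 39. `CODON_TABLE` and `translate` are illustrative names, not part of the specification.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
# Assumptions: standard code, reading frame starting at base 1.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(seq):
    """Translate a DNA sequence into one-letter amino acid codes."""
    seq = seq.replace(" ", "").upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

# SEQ ID NO: 40 as given in the listing (blocks of ten, spaces ignored).
SEQ_ID_40 = (
    "GATGCCTTTT TCTGCATCCC TGTACGTCCT GACTCTCAAT TCTTGTTTGC CTTTGAAGAT "
    "CCTTTGAACC CAACGTCTCA ACTCACCTGG ACTGTTTTAC CCCAAGGGTT CAGGGATAGC "
    "CCCCATCTAT TTGGCCAGGC ATTAGCCCAA"
)

peptide = translate(SEQ_ID_40)
print(len(peptide), peptide)
```

The output uses one-letter codes. Read in three-letter form it begins Asp Ala Phe Phe Cys Ile and ends Ala Leu Ala Gln, which agrees with SEQ ID NO: 39.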



(2) INFORMATION FOR SEQ ID NO: 41 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 11 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 41

Cys Ile Pro Val Arg Pro Asp Ser Gln Phe Leu
1               5                   10

(2) INFORMATION FOR SEQ ID NO: 42



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 17 amino acids 

(B) TYPE : amino acid 

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 42

Val Leu Pro Gln Gly Phe Arg Asp Ser Pro His Leu Phe Gly Glu Ala
1               5                   10                  15

Leu
17

(2) INFORMATION FOR SEQ ID NO: 43 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 8 amino acids

(B) TYPE : amino acid

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 43

Leu Phe Ala Phe Glu Asp Pro Leu

(2) INFORMATION FOR SEQ ID NO: 44

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 8 amino acids 

(B) TYPE : amino acid 

(ii) MOLECULE TYPE : peptide

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 44

Phe Ala Phe Glu Asp Pro Leu Asn
1               5           8



(2) INFORMATION FOR SEQ ID NO: 45 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 25 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 45
GTGCTGATTG GTGTATTTAC AATCC 25



(2) INFORMATION FOR SEQ ID NO: 46 

(i) SEQUENCE CHARACTERISTICS :

(A) LENGTH : 1859 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear



(ii) MOLECULE TYPE : cDNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 46 

GTGCTGATTG GTGTATTTAC AATCCTTTAT CTAATCCGAA ATGCCCATGT TGCAATATGG 60 
AAAGAAAGGG AGTTCCTAAC CTCTGGGGGA ACCCCCATTA AATACCACAA GTAAATCATG 120 
GAGTTATTGC ACACAGTGCA AAAACTCAAG GAGGTGGAAG TCTTACACTG CCAAAGCCAT 180 
CAGAAAAGGG AAGAGGGGAG AAGAGCAGCA TAAGTGGCTA CAGAGGCAAG GAAAGACTAG 240 
CAGAAAGGAA AGAGAGAAAG AGACAGAAAG TCAGAGAGAG AGAGAGGAAG AGACAGAGCA 300 
CAAAGAGGGA GTCAGAGAGA GAGAGAGACA GAGAGTCAGA GAGAAGGAAA GAGAGAGAGG 360 
AAGAGACAAA GAATGAATCA AACAGAGAGA CAGAAAGTCA GAGAGAGAGA GAGAGAGGAA 420 
GAGACAGAGA AAAAGAGGGA GTCAGAAAAA GAGAGAGGAA AGAAGAAGTC CAAAGAGAAA 480 
GAAAGAGAGA TGGAAGTAGT AAAGGAAAAA CAGTGTACCC TATTCCTTTA AAAGCCGGGG 540 
TAAATTTAAA ACCTATAATT GATAACTGAA GGTCTTCTCT GTAACCCTGT AACACTCCAA 600 
TACCACCTTG TTGTCAAGTG TAAACAAGGG CGTAGCCCAA AAGCACTGAG GCCACTAACA 660 
ACCCATAGCC TTCCTATCAA AATTCCTTAA CCCAGCAGGT TTCCTAACAG GGGATCTAAA 720 
TCTTAATTAA TTACCATACA ATGGTCCAAC CAGACTTAGG AGGAATTCCC TTCAGGACGG 780 
GAAGATAGAT GCTTCCTCCC AGGCGATTAA GGGAGAAAGA CACAATGGGT ATTCAGTAAG 840 
TGCCAAGGGG AACACTTGTA GAAGCAAAGT TAGGAAAATT GCCAAATAAT TGGTTTGCTC 900 
AAGAGTTGTT TGCACTCAGC CAAACCTTGA AGTACTTGCA GAATCAGAAA GGAGCCATCT 960 
ATACCAATTC TAAGTTAATA TGGACTGAAG GAGGTTTTAT TAATACCAAA GAGAAATTAA 1020 
AATCCCAAAC TTATAAGGTT TTCAACCAAA GTAAAGTTTG CTAAAAGTTA ACAGCGTAAC 1080 
ATGTATTATC CTACTACCAC ACACTCTCAA AGGATTTCTC AGACAGTTTG CAAGAAATAA 1140 
TGATATCTAT CCTTACTCTA CAATCCCAAA TAGACTCTTT GGCAGCAGTG ACTCTCCAAA 1200
ACCGTCAAGG CCTAGACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC TTCTTAAGGG 1260 
AAGAGTGTTG TCTTTACACT AACCAGTCAG GGATAGTATG AGATGCTGCC CGGCATTTAC 1320 
AGAAAAAGGC TTCTGAAATC AGACAACGCC TTTCAAATTC CTATACCAAC CTCTGGAGTT 1380 
GGGCAACATG GTTTCTTCCC TTTCTATGTC CCATGGCTGC CATCTTGCTA TTACTCGCCT 1440 
TTGGGCCCTG TATTTTTAAC CTCCTTGTCA AATTTGTTTC TTCTAGGATC GAGGCCATCA 1500 
AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGCTC AACTATCAAC TTCTACTGAG 1560 
GACCCCTAGA CCAACCCCCT GGCCCTTTCA CTGGCCTAAA GAGTTCCCCT CTGGAGGACA 1620 
CTACCACTGC AGGGCCCCAT CTTTGCCCCT ATCCAGAAGG AAGTAGCTAG AGCAGTCATT 1680
GCCCAATTCC CAAGAGCAGC TGGGGTGTCC CGTTTAGAGT GGGGATTGAG AGGTGAAGCC 1740
AGCTGGACTT CTGGGTCGGG TGGGGACTTG GAGAACTTTT GTGTCTAGCT AAAGGATTGT 1800
AAATGCAACA ATCAGTGCTC TGTGTCTAGC TAAAGGATTG TAAATACACC AATCAGCAC 1859



(2) INFORMATION FOR SEQ ID NO: 47 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH : 23 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 47
TGATGTGAAC GGCATACTCA CTG 23

(2) INFORMATION FOR SEQ ID NO: 48 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH : 24 base pairs

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 48
CCCAGAGGTT AGGAACTCCC TTTC 24

(2) INFORMATION FOR SEQ ID NO: 49

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH : 25 base pairs 

(B) TYPE : nucleic acid

(C) STRANDEDNESS : single

(D) TOPOLOGY : linear

(ii) MOLECULE TYPE : cDNA

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 49
GCTAAAGGAG ACTTGTGGTT GTCAG 25

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CAACATGGGC ATTTCGGATT AG 22

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGCTGCTAAA GGAGACTTGT GGTTGTCAGA CAATCGCCTA CTTAGGTACC AGGCCTTATT 60 

ACTTGAGGGA CTGGTGCTTC AGATGCGCAC TTGTGCAGCT CTTAACCCAA ACTTATGCTG 120 

CCCAGAAGGA TCTTTTAGAG GTCCCCTTAG CCAACCCTGA CCTCAACCTA TATATATACT 180 

GATGGAAGTT CGTTTGTAGA AAAGGGATTA CAAAGGGNAG GATATNCCAT AGGTTAGTGA 240

TAAAGCAGTA CTTGAAAGTA AGCCTCTTCC CCCCAGGGAC CAGCGCCCCC GTTAGCAGAA 300 

CTAGTGGCAC TGACCCCGAG CCTTAGAACT TGGAAAGGGA GGAGGATAAA TGTGTATACA 360 

GATAGCAAGT ATGCTTATCT AATCCGAAAT GCCCATGTTG 400 


(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2389 base pairs 

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

TCAGGGATAG CCCCCATCTA TTTGGTCAGG CACTGGCCCA AGATCTAGGG ACATGCCACT 60 

TTTAAGAGCC ATTTCTCAAG TCCAGGTACT CTGGTCCTTC GGTATGTGGA TGATTTACTT 120 

TTGGCTACCA GTTCAGTAGC CTCATGCCAG CAGGCTACTC TAGATCTCTT GAACTTTCTA 180

GCTAATCAAG GGTACAAGGC ATCTAGGTTG AAGGCCCAGC TTTGCCTACA GCAGGTCAAA 240 

TATCTAGGCC TAATCTTAGC CAGAGGGACC AGGGCACTCA GCAAGGAACA AATACAGCCT 300 

ATACTGGCTT ATCCTCACCC TAAGACATTA AAACAGTTGC GGGGGTTCCT TGGAATCACT 360 

GGCTTTTTGG TGACTATGGA TTCCCAGATA CAGCAAGATT GGCAGGCCCC TCTATACTGT 420 

AATCAAGGAG ACTCACGAGG GCAAGTACTC ATCTAGTAGA ATGGGAACTA GGGACAGAAA 480

CAGCCTTCAA AACCTTAAAG CAGGCCCTAG TACAATCTCC AGCTTTAAGC CTTCCCACAG 540

GACAAAACTT CTCTTTATAC ATCACAGAGA GGGCAGAGAT AGCTCTTGGT GTCCTTATTC 600

AGACTCATGG GACTACCCCA CAACCAGTGG CACACCTAAG TAAGGAAATT GATGTAGTAG 660

CAAAAGGCTG GCCTCACTGT TTATGGGTAG CTGTGGTGGT GGCTGTCTTA GTGTCAGAAG 720

CTATCAAAAT AATACAAGGA AAGGATCTCA CTGTCTGGAC TACTCATGAT GTAATGGCAT 780

ACTAGGTGCC AAAAGAAGTT TATGGGTATC AGACAACCAC CTGCTTAGAT ACCAGGGACT 840

ACTCCTGGAG GATTGGGCTT CAAGTGCGTT TTTTGTGGCC TCAACCCTGC CACTTTTCCT 900

CCAGAGGATG GAGAGCCGCT TGAGCATGCT TGCCAACAGG TTGTAGGCCA GAATTATTCC 960

ACCCGAGATG ATCTCTTAGA GTACCCTTAG CTAATCCTGA CCTTAACCTA TATACCAATG 1020

GAAGTTCATT TGTGGAAAAC GGGATATGAA GGGCAGGTTA TGTCATAGTT AGTGATGTAA 1080

TCATACTTGC AAGTAAGCCT CTTACCCCAG GGGCCAGCAC TCAGTTAGCA GAACTAGTCA 1140

CACTTACCTT AACCTTAGAA CTGGGAAAGG GAAAAAGAAT AAATATGTAT ACAGATAGTA 1200

AGTATGCTTA TCTAATCCTA CATGCCCATG CTGCAATATG GAAGGAAAGG GAGTTCCTAA 1260

CCCCTGGGGG AACCCCCATT AAATACCACA AGGYAAATCA TGGAGTTATT GCACGCAGTG 1320

CAAAAACTCA AGGAGGTGGC AGTCTTACAC TGCCGAAGCY ATCAAAAAGG GGAAGGAGAG 1380

GGGAGAACAG CAGCATAAGT GGTTGGCAGA GGCAGTGAAA GACCAGCAGA GAGAAGGAGA 1440

GAGACAACGT CAACGACAGA AGGAAAGAAG AGGAGGAGAC AGAGAGGAAG AGACAGAGAG 1500

ACAGTTAGTC CAAGAGAGAG ACAGAGAGAG GAAGAGACAG ACAGAAAGTC CAAGAGAGAA 1560

GGAAAGAGAG GAAGAGACCA AGGAGTCCNA GAGAGAGAAA GAGATAGAAG TAGTAAAGAA 1620

AAAACATTGT ACCCTATTCC TTTAAAAGCC GGGGTATATT TAAAACCTAT AATTGATAAT 1680

TGAGTTCTTG CACCCTCCTC CAGGGGATYG CTGGGAGGAA ACCCTCAACC GATATGTGAA 1740

AATTGTGGGT CGTCCCTATG TCTCAATTAC CAGCCAATAC CCCCTTGTTT TTAGTGTGAA 1800

CGAGGGTGTA GAGCGCAGAC AGGGAGACCT CTGACAATCC ATACCCTTCC TATCCAAAAT 1860

CCTTAACCCA GCAGGTTTTC TAAAAGGGGA TCTAAATCTT AATTAATTAC CATACAAAGG 1920

TCAAACCAGA TCTAGGAGGA ACTTCCTTCA GGACAGGATG ATAGATGGTT CCTCCCAGGC 1980

GATTAAAGAA AATAAAAAGA CACATGGGCA GCCAGTAAGT GATAAGGGAA CACTAGTAGA 2040

AGCAGTTAGG AGAAGTTGCC TAATAATTGG TCTACTCCAA ATGTGTGAGT TGTTCGCACT 2100

CAGCCCAAAT CTTAAAGTAC TTACAGAATT AGGGAGGAGC CATTTACACC AATTCTAAGT 2160

TAATATGGAC TGGATGAGGT TTTATTAATA GCGAAGGAGA ATTAAATCCT AAACTNACAA 2220

GGTTTTCAAC TAAAGTAAAT TTTACTAAAA GCTAACAGTG TAACATGCAT TATCCTACTA 2280

CAACACACTC TCANAGGATT CCTCAGACAG TTTACAAGAA ATAACAAAAT CTATCTGGTA 2340

AGGATAGTAA CTACAATCCC AAATACATTC TTTGGCAGCA GTGACTCTC 2389



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2448 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 



TCAGGGATAG CCCCCATCTA TTTGATCAGG CACTAGCCCA AGATCTAGGC CACTTCTGAA 60
GTCCAGGCAT TCTAGTCCTT CAGTATGTGG ATGATTTACT TTTGGCTACC AGTTTGGAAG 120
CCTCATGCCA GCAGGCTACT TGAGATCTCT TGAACTTTCT AGCTAATCAA GGGTGTATGG 180
CATCTAAATT GAAAGTCCAG CTCTGCCTAC AAGAAGTCAA ATATCTAGGC CTAATCTTAG 240
ATAGAAGAAC CAGGGCCCTC AGCAAGGAAT GAATAAAGCC TATGCTGGCT TATCGGCACC 300
CTAAGACATT AAAACAATTG TGGGGGTTCC TTGGAATCAC TGGCTTTTGC CGACTATGGA 360
TCCCTGGATA GAGTGAGATA GCCAGGCCCC CTCTATTACT CTTATCAAGG AGACCCAGAG 420
GGCAAATACT TATCTAGTAT TATGGGNACC AGAGGCAGAA AAAGCCTTCC AAACCTTAAA 480
GGAGACCCTA GTACAAGCTC CAGCTTTAAG CCTTCCCACA GGACAAANCT TCTCTTTATA 540
TGTCACAGAG AGAGCAGGAA TAGCTCCTGG AGTCCTTACT CAGACTTTTG GACGACCCCA 600
CGGCCAGTGG CRTACCTAAG TAAGGAAATT GATGTAGTAG CAAAAGGCTG GCCTCACTGT 660
TTATGGGTAG TTGCGGCTGT GGCAGTCTTA CTGTCAAAGG CTATCAAAAT AATACAAGGA 720
AAGGATTTCA CTATCTGGAC TACTCATGAG GAAAATGGCA TATTAGGTGC CAAAGGAAGT 780
TTTTGGCTAT CAGACAACCA CCTGCTCAGA TTCCAGGCAC TACTGATTGA GAGACCAGTG 840
CTTTAAATAT GTATGTGTGT GTGTGGCCCT CAACCCTGCC ACTGTTCTCC CAGAAGATGG 900
AGAACCAATG AAGCATTACT GTCAACAAAT TAGAGTCCAG AGTTATGCTG CCTGAGAGGA 960
TCTCTTAGAA GTCCCCTTAG CTAATCCTGA CCTTAACCTA TATGCTGATG GAAGTTCACT 1020
TGTGGAGAAT GGGATACGAA AAGCACATTA TGCCATAGTT AGTGAGGTAA CAGTACTTGA 1080
AAGTAAGCCT ATTCCCCCAT GGACCAGAGC CCAGTTAGCA GAACTAGTGG CACTTACCCA 1140
AGCCTTAGAA CTAGGAAAGG GAAAAATAAT AAATGTGTAT ACAGATAGCA AGTATGCTTA 1200
TCTAATCCTA CATGCCCATG CTGCAGTATG GAAAGAAAGG GAGTTCCTAA CCTCTGGGGG 1260
AACCCCCATT AAATACCACA AGGCAAATCA TGGAGTTATT GCATGTAGTG CAAAACCTCA 1320
AGTAGGTGGC AGTTTTACAC TGCCTGAAGC TATGGGGAAG GAGAGAGGAG AACAGCAGCA 1380
TAAGTGGCTA GCAGAGGCAG CGAAAGACTA GCAGAGAGGA GAGGTAGGGG AAAGACAGAA 1440
AGTCAAAGAA AAGAAGTCAA AGACAGACAG AGAAAGAGAC AGAGGGAGCC AGAGAGAAAG 1500
AAAAGAGAGA ACGAAAGAGA CAGAATGTCA AAGAACAGAA GAGAGAGGCA GCGCCAGAAG 1560
AGTTAAGAAA GTGAGAAAGA GAGATGGAAA TAGTAAAGAA AAAACAGTGT ACCCTATTCC 1620
TTTAAAAGCC AGGGTAAATT TAAAACGTAT AATTTTATAA TTGGAAGGTC TTCTCCATAA 1680
CCCTATAACA TTAAAATACC ACCTTGTTGT CAGTGTAAAC AAGAGCATAG CCCAAAAGCA 1740
CTGAGGCCAC TGACAACCCA TAGCCTTCCT ATCAAAAATC CTTAACTCTG CAGGTTTCCT 1800
AACAGGGGAT CTAAATCTCA ACTAATCACC ATACAATGGT CCGACCAGAC CTAGGAGCGA 1860
CTCCCCTCAG GACAGAAGGA TGGATGGTTC CTCCCAGGCC ATTAAGGGAA AGAGACACAA 1920
TGGGTATTCA GTAAGTGATA AGGGAACTCT TGTAGAAGCA GTTAGGAAGA TTGCCTAATA 1980
TTTGGTCTGC TCAAATGTGC CAGCTGTTTG CACTCAGCTA AACCTTAAAT TACTTACAGA 2040
ATTAGGAAGG AGCCATCTAT ACCAATTCTG AGTTAATATG AGCTGAACAA GTTCTTATTA 2100
ATAGCAAAGA ATCATTGAAA TCTCAAACTT GCAAAGTTTT CAACAAAAGT AAAGTTTGCT 2160
GAAAGTTAGC AGTGTAACAT GTATTATCCT AACTTCTAAT CTTGTGGAAA TCAGACCCTA 2220
TCAGTGCCCC TCAAAGCTGA AGTCCATCAG CATATGGCCA TACAACTAAT ACCCCTATTT 2280
ATAGGGTTAG GAATGGCCAC TGCTACAGGA ATGGGAGTAA CAGGTTTATC TACTTCATTA 2340
TCCTATTACC ACACACTCTT AAAGGATTTC TCAGACAGTT TACAAGAAAT AACAAAATCT 2400
ATCCTTACTC TNTARTCCCA AATAGRTTCT TTGGCAGCAG TGACTCTC 2448

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
CCTGAGTTCT TGCACTAACC C 21

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleotide



(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
GTCCGTTGGG TTTCCTTACT CCT 23
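
Several short entries in this listing appear to be related by strand. The listing does not state this, but comparing SEQ ID NO: 16 with SEQ ID NO: 55 suggests that one is the reverse complement of the other. The sketch below checks that reading. `reverse_complement` is an illustrative helper, not part of the specification, and it assumes sequences containing only A, C, G and T.

```python
# Complement table for unambiguous DNA bases only (assumption: no IUPAC
# ambiguity codes occur in the sequences being compared).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    seq = seq.replace(" ", "").upper()
    return seq.translate(COMPLEMENT)[::-1]

# Sequences copied from the listing (blocks of ten, spaces ignored).
SEQ_ID_16 = "AGGAGTAAGG AAACCCAACG GAC"
SEQ_ID_55 = "GTCCGTTGGG TTTCCTTACT CCT"

is_revcomp = reverse_complement(SEQ_ID_16) == SEQ_ID_55.replace(" ", "")
print(is_revcomp)
```

The same helper can be applied to other primer-length entries to look for similar pairs.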






(2) INFORMATION FOR SEQ ID NO: 56: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1196 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 





TTCCTGAGTT CTTGCACTAA CCTCAAATGA GAGAAGTGCC GCCATAACTG CAACCCAAGA 60
GTTTGGCGAT CCCTGGTATC TCAGTCAGGT CAATGACAGG ATGACAACAG AGGAAAGATA 120
ATGATTCCCC ACAGGCCAGC AGGCAGTTCC CAGTGTAGAC CCTCATTAGG ACACAGAATC 180
AGAACATGGA GATTGGTGCC GCAGACATTT GCTAACTTGC GTGCTAGAAG GACTAAGGAA 240
AACTAGGAAG ATATGAATTA TTCAATGATG TCCACTATAA CACAGGGGAA AGGAAGAAAA 300
TCCTACTGCC TTTCTGGAGA GACTAAGGGA GGCATTGAGG AAGCATACCA GGCAAGTGGA 360
CATTGGAGGC TCTGGAAAAG GGAAAAGTTG GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 420
CCAGTGTGGT CTACAAGGAC ACTTTAAAAA AGATTGTCCA ATAGAAATAA GCCACCACCT 480
CGTCCATGCC CCTTATGTCA AGGGAATCAC TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 540
TCCTCTGAGT CAGAAGCCAC TAACCAGATG ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 600
CAAGCGCCAG CCCATGCCAT CACCCTCACA GAGCCCCAGG TATGCTTGAC CATTGAGGGT 660
CAGAAGGGTA CTGTCTCCTG GACACTGGCG GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 720
ACAACTGTCC TCCAGATCTG TCACTGTCCG AGGGGTCCTA GGACAGCCAG TCACTAGATA 780
CTTCTCCCAG CCACTAAGTT GTGACTGGGG AACTTTACTC TTCCACATGC TTTTCTAATT 840
ATGCCTGAAA GCCCCACTCT CTTGTTAGGG GAGAGACATT CTAGCAAAAG CAGGGGCCAT 900
TATACATGTG AATATAGGAG AAGGAACAAC TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 960
TAATCCTGAA GTCCGGGCAA CAGAAGGACA ATATGGACAA GCAAAGAATG CCCGTCCTGT 1020
TCAAGTTAAA CTAAAGGATT CCACCTCCTT TCCCTACCAA AGGCAGTACC CCCTCAGACC 1080
CGAGACCCAA CAAGAACTCC AAAAGATTGT AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 1140
ACCAAGCAAT AGCCCTTGCA AGACTCCAAT TTTAGGAGTA AGGAAACCCA ACGGAC 1196



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2391 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:



ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC CATCACCCTC 60
ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG GTNACTGTCT CCTGGACACT 120
GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT CCTGGACAAC TGTCCTCCAG ATCTGTCACT 180
GTCCGAGGGG TCCTAGGACA GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC 240
TGGGGAACTT TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT AGGAGAAGGA 360
ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC CTGAAGTCCG GGCAACAGAA 420
GGACAATATG GACAAGCAAA GAATGCCCGT CCTGTTCAAG TTAAACTAAA GGATTCCACC 480
TCCTTTCCCT ACCAAAGGCA GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG 540
ATTGTAAAGG ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA ACTCAGGATT 660
ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA ACCCTTATAC AGTGCTTTCC 720
CAAATACCAG AGGAAGCAGA GTGGTTTACA GTCCTGGACC TTAAGGATGC CTTTTTCTGC 780
ATCCCTGTAC GTCCTGACTC TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG 840
TCTCAACTCA CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT CCTTCAGTAC 960
ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGAA 1020
CTCTTAACTT TCCTCACTAC CTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCGGCTCTGC 1080
TCACAGGAGA TTAGATACTN AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG 1140
GAACGTATCC AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC AATAGCCAGA 1260
CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA CCTATTTAGT AAGATGGACA 1320
CCTACAGAAG TGGCTTTCCA GGCCCTAAAG AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC 1380
TTGCCAACAG GGCAAGATTT TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA 1440
GTCCTTACGC AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TAAGGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT AGCAGTCTNA 1560
GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN CTGTGTGGAC ATCTCATGAT 1620
GTGAACGGCA TACTCACTGC TAAAGGAGAC TTGTGGTTGT CAGACAACCA TTTACTTAAN 1680
TATCAGGCTC TATTACTTGA AGAGCCAGTG CTGNGACTGC GCACTTGTGC AACTCTTAAA 1740
CCCAAACTTA TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG GGNAGGATAT 1860
NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC CTCTTCCCCC CCAGGGACCA 1920
GCGCCCCCGT TAGCAGAACT AGTGGCACTG ACCCCGCGAG CCTTAGAACT TTGGAAAGGG 1980
AGGAGGATAA ATGTGTATAC AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT 2040
GTTTATCTAA TCCGAAATGC CCATGTTGCA ATATGGAAAG AAAGGGAGTT CCTAACCTCT 2100
GGGGGAACCC CCATTAAATA CCACAAGTTA ATCATGGAGT TATTGCACAC AGTGCAAAAA 2160
CTCAAGGAGG TGGAAGTCTT ACACTGCCAA AGCCATCAGA AAAGGGAAAG GGGAGAAGAG 2220
CAGCATAAGT GGCTACAGAG GCAAGGAAAG ACTAGCAGAA AGGAAAGAGA GAAAGAGACA 2280
GAAAGTCAGA GAGAGAGAGA GGAAGAGACA GAGCACAAAG AGGGAGTCAG AGAGAGAGAG 2340
AGACAGAGAG TCAGAGAGAA GGAAAGAGAG AGAGGAAGAG ACAAAGAATG A 2391

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1722 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:





TGGAGAATAG CAGCATAAGT TGGCTGGCAG AAGTAGGGAA AGACAGCAAG AAGTAAAGAA 60
AAAAARGAGA AAGTCAGAGA AAGAAAAAAA GAGAGGAAGA AACAAAGAAG AACTTGAAGA 120
GAGAAAGAAG TAGTAAAGAA AAAACAGTAT ACCCTATTCC TTTAAAAGCC AGGGTAAATT 180
TCTGTCTACC TAGCCAAGGC ATATTCTTCT TATGTGGAAC ATCAACCTAT ATCTGCCTCC 240
CCACTAACTG GACAGGCACC TGAACCTTAG TCTTTCTAAG TCCCAACATT AACATTGCCC 300
CAGGAAATCA GACCCTATTG GTACCTGTCA AAGCTAAAGT CCCGTCAGTG CAGAGCCATA 360
CAACTAATAT CCCTATTTAT AGGGTTAGGA ATGGCTACTG CTACAGGAAC TGGAATAGCC 420
GGTTTATCTA CTTCATTATC CTACTACCAT ACACTCTCAA AGAATTTCTC AGACAGTTTG 480
CAAGAAATAA TGAAATCTAT TCTTACTTTA CAATCCCAAT TAGACTCTTT GGCAGCAATG 540
ACTCTCCAAA ACCGCCGAGG CCCACACCTC CTCACTGCTG AGAAAGGAGG ACTCTGCACC 600
TTCTTAGGGG AAGAGTGTTG TTTTTACACT AACCAGTCAG GGATAGTACG AGATGCCACC 660
TGGCATTTAC AGGAAAGGGC TTCTGATATC AGACAATGCC TTTCAAACTC TTATACCAAC 720
CTCTGGAGTT GGGCAACATG GCTTCTTCCA TTTCTAGGTC CCATGGCAGC CATCTTGCTG 780
TTACTCACCT TTGGGCCCTG TATTTTTAAG CTTCTTGTCA AATTTGTTTC CTCTAGGATC 840
GAAGCCATCA AGCTACAGAT GGTCTTACAA ATGGAACCCC AAATGAGTTC AACTAACAAC 900
TTCTACCAAG GACCCCTGGA ACGATCCACT GGCACTTCCA CTAGCCTAGA GATTCCCCTC 960
TGGAAGACAC TACAACTGCA GGGCCCCTTC TTTGCCCCTA TCCAGCAGGA AGTAGCTAGA 1020
GCGGTCATCG GCCAAATTCC CAACAGCAGT TGGGGTGTCC TGTTTAGAGG GGGGATTGAA 1080
GAGGTGACAG CCTGCTGGCA GCCTCACAGC CCTCGTTGGY TCTCAGTGCC TCCTCAGCCT 1140
TGGTGCCCAC TCTGGCCGTG CTTGAGGAGC CCTTCAGCCT GCCACTGCAC TGTGGGAGCC 1200
TCTTTCTGGG CTGGACAAGG CCGGAGCCAG CTCCCTCAGC TTGCAGGGAG GTATGGAGGG 1260
AGAGATGCAG GCGGGAACCA GGGCTGCGCA TGGCGCTTGC GGGCCAGCAT GAGTTCCAGG 1320
TGGGCGTGGG CTCGGCGGGC CCCACACTCG GGCAGTGAGG GGCTTAGCAC CTGGGCCAGA 1380
CAGATGCTGT GCTCAACTTC TTCGCTGGGC CTTAGCTGCC TTCCCCGTGG GGCAGGGCTY 1440
CGGGAACMTG CAGCCTGCCC ATGCTTGAGC CCCCCACCCC GCCGTGGGTT CYTGCACAGC 1500
CCAAGCTTCC CGGACAAGCA CCACCCCTTA TCCACGGTGC CCAGTCCCAT CAACCACCCA 1560
AGGGTTGAGG AGTGCGGGCA CACAGCGCGG GATTGGCAGG CAGTTCCACT TGCGGCCTTG 1620
GTGCGGGATC CACTGCGTGA AGCCAGCTGG GCTCCTGAGT CTGGTGGGGA CTTGGAGAAT 1680
CTTTATGTCT AGCTAAGGGA TTGTAAATAC ACCAATCAGC AC 1722



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 495 base pairs

(B) TYPE: nucleotide

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTA 495
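The listing declares SEQ ID NO: 59 as 495 bases and prints a running base count at the end of each row. As a minimal sketch, not part of the original listing, the following Python checks those counters against the sequence itself and tallies the IUPAC ambiguity symbols it contains. The helper name `parse_rows` is hypothetical, chosen only for this illustration.

```python
from collections import Counter

# SEQ ID NO: 59 as printed in the listing: groups of 10 bases,
# with the running base count at the end of each row.
SEQ_ID_59_ROWS = """\
CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTA 495
"""


def parse_rows(text):
    """Join the base groups and record (printed, actual) counts per row."""
    sequence = ""
    counters = []
    for row in text.splitlines():
        *groups, printed = row.split()
        sequence += "".join(groups)
        counters.append((int(printed), len(sequence)))
    return sequence, counters


seq59, counters = parse_rows(SEQ_ID_59_ROWS)

# Symbols other than A, C, G, T are IUPAC ambiguity codes
# (N = any base, Y = C or T).
ambiguous = Counter(base for base in seq59 if base not in "ACGT")

print(len(seq59), dict(ambiguous))
```

The same check can be applied to any other nucleotide entry in the listing by pasting its rows in the same layout.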



(2) INFORMATION FOR SEQ ID NO: 60: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2503 base pairs 
(B) TYPE: nucleotide

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 



CCAAGAACCC ACCAATTCCG GANCACATTT TGGCGACCAC GAAGGGACTT TCGCATATCG 60 

CCAAGCGGTG AGACAATAGC CGAGCGGTGA GACCTTTCCC AATCGCCAAG CAGTGAGTAC 120 

CATCAGACCC CTTTCACTTG CTATTCTGTC CTATCTTTCT TTAGAATTCG GGGGCTAAAT 180

ACCGGGCATC TGTCAGCCAT TTAAAAGTGA CTAGCGGGCC GCCGGACTAA AGACACGGGT 240 
GTCAAGCTTT CTGGGAAAGG GCTCTCTAAC AACCCCCAAC TCTTTGGAGT TGGGACCGTT 300
GGTTTGCCTA GAACCAGCTT CCGCTTTTCC TGTACTTCTG GGCTGAGCCG TGGGTTGACA 360
GTGAAGGAAA GCCATGCATC TCCGGGGTCT CGMCAACATG TTGGTTGACC CTGCGGCCAT 420
GAGTGGAACT CTCAAAAGCA TGTCGCCCAA GCGACACTCG CCTATCTATC CTATCTATCC 480
TGACCCTTGC CCTCTGGGTC CTAATGCCTG CCAGACAAAC TTCCTCTCGC CTCTCTTCTC 540
TGAAGCTAGA ACCGCTTCTA AAAATTGCTA CCTGGTCTCT GGTGCTTTTC CTARTTTCTC 600
CTATAAAGAA TGAWTTCTAG TATTAAACTC CAGGACTCTG TTACCTTCTT TAGGCACCCG 660
GGCTCACCAA TCAGAAAGAC ACAGTTTTTG CCCAAGGCCC CATCGTAGTG GGGACTACCT 720
GGAATTTTAG GATCCCTCCT CAGACTAACA GGCCTAACAA AAGTTATTCC TGAAGCTAGG 780
ATATGGGGAG CCTCAGAAAT TGTATCCCTC CTATTCATAT AAGTGAGAAC AAAAGGTGTC 840
ACTCTTCCAA CCCTGAAGAT CCCCTCCCTC CCTCAGGGTA TGGCCCTCCA TTTCATTTTT 900
GTGGCATAAC ATCTTTATAG GATGGGGTAA AGTCCCAATA CTAACAGGAG AATGCTTAGG 960
ACTCTAACAG GTTTTTGAGA ATGCGTCAGT AAGGGCCACT AAATCTGATT TTTCTCAGTC 1020
GGTCCTCCTT GTGGTCTAGG AGGACAGGCA AGGTTGTGCA GGTTTTCGAG AATGCGTCAG 1080
TAAGGACCAC TAAATCCGAC CTTCCTCGGT CCTCCATGTG GTCTGGGAGG AAAACTAGTG 1140
TTTCTGCTGC TGCGTCGGTG AGCGCAACTA TTCAAGTCAG CAGGGTCCAG GGACCGTTGC 1200
AGGTTCTTGG GCAGGGGTTG TTTCTGCTGC TGCATTGGTG AATGCAACTA TTCTGATCAG 1260
CAGGGTCCCA GGACCATTGC AGGTCCTTGG GCAGGGAGAG AAACAAAACA AACCAAAACT 1320
GTGGGCGGTT TTGTCTTTCA TATGGGAAAC ACTCAGGCAT CAACAGGTTC ACCCTTGAAA 1380
TGCATCCTAA GCCATTGGGA CCAATTTGAC CCACAAACCC TGAAAAAGAG GAGGCTCATT 1440
TTTTCCTGCA CTACGGCTTG GCCCCAATAT TCTCTTTYTG ATGGGGAAAA ATGGCCACCT 1500
GAGGGAAGCA CAAATTACAA TAYTATCCTA CAGCYTGATC TTTTCTGTAA GAGGGAAGGC 1560
AAATGGAGTG AATACCTTAT GTCCAAGCTT TCTTTTCATT GAGGGAGAAT ACACAACTAT 1620
GCAAAGCTTG CAATTTACAT CCCACAGGAG GACCCTTCAG CTTACCCCCA TATCCTAGCC 1680
TCCCTATAGC TTCCCTTCCT ATTGATGATA CTCCTCCTCT AATCTCCCCT GCCCAGAAGG 1740
AAATAAGCAA AGAAATCTCC AAAGGTCCAC AAAAACCCCC GGGCTATCGG TTATGTCCCT 1800
TCAAGYTGTA GGGGGAGGGG AATTTGGCCC AACCCGGGTG CATGTCCCTT CTCCCTCTCT 1860
GATTTAAAGC AGATCAAGGC AGACCTGGGG AAGTTTTCAG ATGATCCTGA TAGGTACATA 1920
GATGTCCTAC AGGGTCTAGG GCAAACCTTT GACCTCACTT GGAGAGACGT CATGCTACTG 1980
TTAGATCAAA CCCTGGCCTT TAATGAAAAG AATGCGGCTT TAGCTGCAGC CTGAGAGTTT 2040
GGAGATACCT GGTATCCTAG TCAAGTAAAT GAAAGAATGA CAGCCGAAGA AAGGGACAAC 2100
TTCCTTACTG GTCAGCAACC CATCCCCAGT ATGGATCCCC ACTGGGACTT TGACTCAGAT 2160
CATGGGGACT GGAGTCGTAA ACATCTGTTG ATCTGTGTTC TGGAAGGACT AAGGAGAATT 2220
GGGAAAAAGC CCATGAATTA TTCAATGATA TCCACCATAA CCCAGGGAAA GGAAGAAAAT 2280
CCTTCTGCCT TCCTCGAGCG GCTACAAGAG GCCTTAAGAA AATATACTCC CCTGTCACCC 2340
GAATCACTCG AGGGTCAATT GATTCTAAAA GATAAGTTTA TTACCCAATC AGCCACAGAT 2400
ATCAGGAGAA AGCTCCAAAA GCAAGCCCTG AGCCTGAACA AAATCTAGAG ACATTATTAA 2460
ACCTGGCAAC CTTGGTGTTC TATAATAGGG ACCAAGAGGA ACA 2503



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1167 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

AAGGTUVACTC AGAAAGCCAA TACCCATTTA GTAAGATGGA CACCAGAAGC AGAAGCAGCT 60
TTCCAGGCCC TAAAGAAATC CCTAACCCAA GCCCCAGTGT TAAGCTTGCC AACGGGGCAA 120
GACTTTTCTT TATATGTCAC AGAAAAACAG GAATAGCTCT AGGAGTCCTT ACACAGGTCC 180
AAGGGACAAG CTTGCAACCT GTGGCATACC TGAGTAAGGA AACTGATGTA NTGGCAAAGG 240
GTTGGCCTCA TTGTTTACAG GTAGGGCAGC AGTAGCAGTC TTAGTTTCTG AAACAGTTAA 300
AATAATACAG GGAAGAGATC TTACTGTGTG GACATCTCAT GATGTGAACG GCATACTCAC 360
TGCTAAAGAG GACTTGTGGC TGTCAGACAA CCATTTACTT AAATAGCAGG TTCTATTACT 420
TGAAGTGCCA GTGCTGCGAC TGCACATTTG TGCAACTCTT AACCCAGCCA CATTTCTTCC 480
AGACAATGAA GAAAAGATAG AACATAACTG TCAACAAGTA ATTGCTCAAA CCTATGCTGC 540
TCGAGGGGAC CTTCTAGAGG TTCCCTTGAC TGATCCCGAC CTCAACTTGT ATACTGATGG 600
AAGTTCCTTG GCAGAAAAAG GACTTTGAAA AGCGGGGTAT GCAGTGATCA GTGAT7VATGG 660
AATACTTGAA AGTAATCGCC TCACTCCAGG AACTAGTGCT CACCTGGCAG AACTAATAGC 720
CCTCACTTGG GCACTAGAAT TAGGAGAAGG AAAAAGGGTA AATATATATT CAGACTCTAA 780
GTATGCTTAC CTAGTCCTCC ATGCCCATGC AGCAATATGG AGAGAGAGGG AATTCCTAAC 840
TTCTGAGGGA ACACCTATCA ACCATCAGGG AAGCCATTAG GAGATTATTA TTGGCTGTAC 900
AGAAACCTAA AGAGGTGGCA GTCTTACACT GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG 960
AAATAGAAGG CAATCGCCAA GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC 1020
CATTAGAAAT GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC 1080
CCCAGTACTC AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT 1140
CCAGATGGCT AGCCACTGAG GAAGGAA 1167

(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 78 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

TCCAAAGGCA CCAGGGCCCT CAGTGAGGAA CGTATCCAGC CTATACTGGC TTATCCTCAT 60
CCCAAAACCC TAAAGCAA 78



(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 amino acids
(B) TYPE: amino acid
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Ser Lys Gly Thr Arg Ala Leu Ser Glu Glu Arg Ile Gln Pro Ile Leu
1 5 10 15
Ala Tyr Pro His Pro Lys Thr Leu Lys Gln
20 25
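SEQ ID NO: 62 is 78 bases long and SEQ ID NO: 63 is 26 amino acids long, which is what a single reading frame starting at the first base would give. As a hedged sketch (the relationship is inferred from the two entries, and this section of the listing does not state it), the following Python translates SEQ ID NO: 62 with the standard genetic code and compares the result with SEQ ID NO: 63 written in one-letter code. The names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# SEQ ID NO: 62 (78 bases) with the group spacing removed.
SEQ_ID_62 = (
    "TCCAAAGGCACCAGGGCCCTCAGTGAGGAACGTATCCAGC"
    "CTATACTGGCTTATCCTCATCCCAAAACCCTAAAGCAA"
)
# SEQ ID NO: 63 rewritten in one-letter code for comparison.
SEQ_ID_63 = "SKGTRALSEERIQPILAYPHPKTLKQ"


def translate(dna):
    """Translate complete codons of `dna`, reading from the first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


peptide = translate(SEQ_ID_62)
print(peptide, peptide == SEQ_ID_63)
```

The lookup assumes unambiguous bases; a sequence containing IUPAC symbols such as N or Y would need an explicit fallback (for example, emitting X for any codon not in the table).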



(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

AAATGTCTGC GGCACCAATC TCCATGTT 28

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

AAGGGGCATG GACGAGGTGG TGGCTTATTT 30

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

GGAGAAGAGC AGCATAAGTG G 21

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

GTGCTGATTG GTGTATTTAC AATCC 25

(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

GACTCGCTGC AGATCGATTT TTTTTTTTTT TTTT 34



(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GCCATCAAGC CACCCAAGAA CTCTTAACTT 30

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

CCAATAGCCA GACCATTATA TACACTAATT 30

(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

GCCATAACTG CAACCCAAGA GTT 23

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GGACGAGGTG GTGGCTTATT TCT 23

(2) INFORMATION FOR SEQ ID NO: 73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

AACTTGCGTG CTAGAAGGAC TAAGG 25

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:

AACTTTTCCC TTTTCCAGAT CCTC 24

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GCATACCAGG CAAGTGGACA TT 22

(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

GAGGCTCTGG AAAAGGGAAA AGTT 24

(2) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

AGGAGTAAGG AAACCCAACG GACAG 25
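Several of the short single-stranded entries are related by strand orientation: read as printed, SEQ ID NO: 79 is the reverse complement of SEQ ID NO: 76. A minimal sketch of that check follows; it is an illustration only, and the function name `reverse_complement` is chosen for this example.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement every base, then reverse to read the opposite strand 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing with the group spacing removed.
SEQ_ID_76 = "CTGTCCGTTGGGTTTCCTTACTCCT"
SEQ_ID_79 = "AGGAGTAAGGAAACCCAACGGACAG"

is_reverse_complement = reverse_complement(SEQ_ID_76) == SEQ_ID_79
print(is_reverse_complement)
```

Applying the function twice returns the input unchanged, which is a quick sanity check when comparing other primer pairs in the listing.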

(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TGTATATAAT GGTCTGGCTA TTGGG 25

(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

AGGAGTAAGG AAACCCAACG GACAG 25

(2) INFORMATION FOR SEQ ID NO: 82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CTCGATTTCT TGCTGGGCCT TA 22



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

GTTGATTCCC TCCTCAAGCA 20

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

CTCTACCAAT CAGCATGTGG 20

(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

TGTTCCTCTT GGTCCCTAT 19



(2) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 433 amino acids
(B) TYPE: amino acid
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Met Ala Thr Ala Thr Gly Thr Gly Ile Ala Gly Leu Ser Thr Ser Leu
1 5 10 15
Ser Tyr Tyr His Thr Leu Ser Lys Asn Phe Ser Asp Ser Leu Gln Glu
20 25 30
Ile Met Lys Ser Ile Leu Thr Leu Gln Ser Gln Leu Asp Ser Leu Ala
35 40 45
Ala Met Thr Leu Gln Asn Arg Arg Gly Pro His Leu Leu Thr Ala Glu
50 55 60
Lys Gly Gly Leu Cys Thr Phe Leu Gly Glu Glu Cys Cys Phe Tyr Thr
65 70 75 80
Asn Gln Ser Gly Ile Val Arg Asp Ala Thr Trp His Leu Gln Glu Arg
85 90 95
Ala Ser Asp Ile Arg Gln Cys Leu Ser Asn Ser Tyr Thr Asn Leu Trp
100 105 110
Ser Trp Ala Thr Trp Leu Leu Pro Phe Leu Gly Pro Met Ala Ala Ile
115 120 125
Leu Leu Leu Leu Thr Phe Gly Pro Cys Ile Phe Lys Leu Leu Val Lys
130 135 140
Phe Val Ser Ser Arg Ile Glu Ala Ile Lys Leu Gln Met Val Leu Gln
145 150 155 160
Met Glu Pro Gln Met Ser Ser Thr Asn Asn Phe Tyr Gln Gly Pro Leu
165 170 175
Glu Arg Ser Thr Gly Thr Ser Thr Ser Leu Glu Ile Pro Leu Trp Lys
180 185 190
Thr Leu Gln Leu Gln Gly Pro Phe Phe Ala Pro Ile Gln Gln Glu Val
195 200 205
Ala Arg Ala Val Ile Gly Gln Ile Pro Asn Ser Ser Trp Gly Val Leu
210 215 220
Phe Arg Gly Gly Ile Glu Glu Val Thr Ala Cys Trp Gln Pro His Ser
225 230 235 240
Pro Arg Trp Xaa Ser Val Pro Pro Gln Pro Trp Cys Pro Leu Trp Pro
245 250 255
Cys Leu Arg Ser Pro Ser Ala Cys His Cys Thr Val Gly Ala Ser Phe
260 265 270
Trp Ala Gly Gln Gly Arg Ser Gln Leu Pro Gln Leu Ala Gly Arg Tyr
275 280 285
Gly Gly Arg Asp Ala Gly Gly Asn Gln Gly Cys Ala Trp Arg Leu Arg
290 295 300
Ala Ser Met Ser Ser Arg Trp Ala Trp Ala Arg Arg Ala Pro His Ser
305 310 315 320
Gly Ser Glu Gly Leu Ser Thr Trp Ala Arg Gln Met Leu Cys Ser Thr
325 330 335
Ser Ser Leu Gly Leu Ser Cys Leu Pro Arg Gly Ala Gly Leu Arg Glu
340 345 350
Xaa Ala Ala Cys Pro Cys Leu Ser Pro Pro Pro Arg Arg Gly Phe Leu
355 360 365
His Ser Pro Ser Phe Pro Asp Lys His His Pro Leu Ser Thr Val Pro
370 375 380
Ser Pro Ile Asn His Pro Arg Val Glu Glu Cys Gly His Thr Ala Arg
385 390 395 400
Asp Trp Gln Ala Val Pro Leu Ala Ala Leu Val Arg Asp Pro Leu Arg
405 410 415
Glu Ala Ser Trp Ala Pro Glu Ser Gly Gly Asp Leu Glu Asn Leu Tyr
420 425 430
Val
433



(2) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 693 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTTCCCCAAC TAATAAGGAC CCCCCTTTCA ACCCAAACAG TCCAAAAGGA CATAGACAAA 60
GGAGTAAACA ATGAACCAAA GAGTGCCAAT ATTCCCTGGT TATGCACCCT CCAAGCGGTG 120
GGAGAAGAAT TCGGCCCAGC CAGAGTGCAT GTACCTTTTT CTCTCTCACA CTTGAAGCAA 180
ATTAAAATAG ACNTAGGTNA ATTNTCAGAT AGCCCTGATG GYTATATTGA TGTTTTACAA 240
GGATTAGGAC AATCCTTTGA TCTGACATGG AGAGATATAA TATTACTGCT AAATCAGACG 300
CTAACCTCAA ATGAGAGAAG TGCTGCCATA ACTGGAGCCC GAGAGTTTGG CAATCTCTGG 360
TATCTCAGTC AGGTCAATGA TAGGATGACA ACGGAGGAAA GAGAACGATT CCCCACAGGG 420
CAGCAGGCAG TTCCCAGTGT AGCTCCTCAT TGGGACACAG AATCAGAACA TGGAGATTGG 480
TGCCGCAGAC ATTTACTAAC TTGCGTGCTA GAAGGACTAA GGAAAACTAG GAAGACTATG 540
AATTATTCAA TGATGTCCAC TATAACACAG GGGAAAGGAA GAAAATCCTA CTGCCTTTCT 600
GGAGAGACTA AGGGAGGCAT TGAGGAAGCA TACCAGGCAA GTGGACATTG GAGGCTCTGG 660
AAAAGGGAAA AGTTGGGCAA ATTGAATGCC TAA 693



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1577 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

AACTTGCGTG CTAGAAGGAC TAAGGAAAAC TAGGAAGACT ATGAATTATT CAATGATGTC 60
CACTATAACA CAGGGGAAAG GAAGAAAATC CTACTGCCTT TCTGGAGAGA CTAAGGGAGG 120
CATTGAGGAA GCATACCAGG CAAGTGGACA TTGGAGGCTC TGGAAAAGGG AAAAGTTGGG 180
CAAATTGAAT GCCTAATAGG GCTTGCTTCC AGTGCAGTCT ACAAGGACGC TTTAGAAAAG 240
ATTGTCCAAG TAGAAATAAG CCGCCCCTCG TCCATGCCCC TTATGTCAAG GGAATCACTG 300
GAAGGCCTAC TGCCCCAGGG GACGAAGGTC CTCTGAGTCA GAAGCCACTA ACCTGATGAT 360
CCAGCAGCAG GACTGAGGGT GCCCGGGGCA AGTGCCAGCC CATGCCATCA CCCTCAGAGC 420
CCCGGGTATG TTTGACCATT GAGAGCCAGG AAGTTAACTG TCTCCTGGAC ACTGGCGCAG 480
CCTTCTCAGT CTTACTTTCC TGTCCCAGAC AATTGTCCTC CAGATCTGTC ACTATCCGAG 540
GGGTCCTAAG ACAGCCAGTC ACTACATACT TCTCTCAGCC ACTAAGTTGT GACTGGGGAA 600
CTTTACTCTT TTCACATGCT TTTCTAATTA TGCCTGAAAG CCCCACTCCC TTGTTAGGGA 660
GAGACATTTT AGCAAAAGCA GGGGCCATTA TACACCTGAA CATAGGAAAA GGAATACCCA 720
TTTGCTGTCC CCTGCTTGAG GTU^GGAATTA ATCCTGAAGT CTGGGCAATA GAAGGACAAT 780
ATGGACAAGC AAAGAATGCC CGTCCTGTTC AAGTTAAACT AAAGGATTCT GCCTCCTTTC 840
CCTACCAAAG GAAGTACCCT CTTAGACCCG AGGCCCTACA AGGACTCAAA AGATTGTTAA 900
GGACCTAAAA GCCCAAGGCC TAGTAAAACC ATGCAGTAGC CCCTGCAATA CTCCAATTTT 960
AGGAGTAAGG AAACCCAACG GACAGTGGAG GTTAGTGCAA GATCTCAGGA TTATTAATGA 1020
GGCTGTTTTT CCTCTATACC CAGCTGTATC TAGCCCTTAT ACTCTGCTTT CCCTAATACC 1080
AGAGGAAGCA GAGTAGTTTA CAGTCCTGGA CCTTAAGGAT GCCTCTTTCT GCATCCCTGT 1140
ACATCCTGAT TCTCAATTCT TGTTTGTCTT TGAAGATCCT TTGAACCCAA TGTCTCAATT 1200
CACCTGGACT GTTTTACCCC AGGGGTTCCG GGATAGCCCC CATCTATTTG GCCAGGCATT 1260
AGCCCAAGAC TTGAGCCAAT TCTCATACCT GGACATCTTG TCCTTCGGTA TGGGATGATT 1320
TAATTTTAGC CACCCGTTCA GAAACCTTGT GCCATCAAGC CACCCAAGCG TTCTTAAATT 1380
TCCTCACTCC GTGTGGCTAC AAGGTTTCCA AACCAAAGGC TCAGCTCTGC TCACAGCAGG 1440
TTAAATACTT AGGGTTAAAA TTATCCAAAG GCACCAGGGC CCTCTGTGAG GAATGTATCC 1500
AACCTGTACT GGCTTATCTT CATCCCAAAA CCCTAAAGCA ACTAAGAAGG TCCTTGGCAT 1560
AACAGGTTTC TGCCGAA 1577



(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 182 amino acids
(B) TYPE: amino acid
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Ser Ser Ser Arg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro
1 5 10 15
Ser Pro Ser Glu Pro Arg Val Cys Leu Thr Ile Glu Ser Gln Glu Val
20 25 30
Asn Cys Leu Leu Asp Thr Gly Ala Ala Phe Ser Val Leu Leu Ser Cys
35 40 45
Pro Arg Gln Leu Ser Ser Arg Ser Val Thr Ile Arg Gly Val Leu Arg
50 55 60
Gln Pro Val Thr Thr Tyr Phe Ser Gln Pro Leu Ser Cys Asp Trp Gly
65 70 75 80
Thr Leu Leu Phe Ser His Ala Phe Leu Ile Met Pro Glu Ser Pro Thr
85 90 95
Pro Leu Leu Gly Arg Asp Ile Leu Ala Lys Ala Gly Ala Ile Ile His
100 105 110
Leu Asn Ile Gly Lys Gly Ile Pro Ile Cys Cys Pro Leu Leu Glu Glu
115 120 125
Gly Ile Asn Pro Glu Val Trp Ala Ile Glu Gly Gln Tyr Gly Gln Ala
130 135 140
Lys Asn Ala Arg Pro Val Gln Val Lys Leu Lys Asp Ser Ala Ser Phe
145 150 155 160
Pro Tyr Gln Arg Lys Tyr Pro Leu Arg Pro Glu Ala Leu Gln Gly Leu
165 170 175
Lys Arg Leu Leu Arg Thr
180
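The listing uses three-letter amino acid codes for some peptide entries (for example SEQ ID NO: 90) and one-letter codes for others (for example SEQ ID NO: 95). As an illustrative sketch only, the following Python converts a three-letter row to one-letter code so that entries in the two notations can be compared. The mapping is the conventional one, `Xaa` is mapped to `X`, and the function name `to_one_letter` is hypothetical.

```python
# Conventional three-letter to one-letter amino acid codes;
# Xaa marks an unknown residue and is written X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_row):
    """Convert a whitespace-separated row of three-letter codes."""
    return "".join(THREE_TO_ONE[code] for code in three_letter_row.split())


# First row (residues 1 to 16) of SEQ ID NO: 90 as listed.
SEQ_ID_90_ROW_1 = "Ser Ser Ser Arg Thr Glu Gly Ala Arg Gly Lys Cys Gln Pro Met Pro"

one_letter = to_one_letter(SEQ_ID_90_ROW_1)
print(one_letter)
```

The converted row matches the first 16 residues printed for SEQ ID NO: 95; an unrecognised code raises `KeyError`, which helps catch misread residue names.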



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

AGATCTGCAG AATTCGATAT CACCCCCCCC CCCCCC 36

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

AGATCTGCAG AATTCGATAT CA 22

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2304 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

TCCAGCAGCA GGACTGAGGG TGCCCGGGGC AAGTGCCAGC CCATGCCATC 50
ACCCTCAGAG CCCCGGGTAT GTTTGACCAT TGAGAGCCAG GAAGTTAACT 100
GTCTCCTGGA CACTGGCGCA GCCTTCTCAG TCTTACTTTC CTGTCCCAGA 150
CAATTGTCCT CCAGATCTGT CACTATCCGA GGGGTCCTAG GACAGCCAGT 200
CACTACATAC TTCTCTCAGC CACTAAGTTG TGACTGGGGA ACTTTACTCT 250
TTTCACATGC TTTTCTAATT ATGCCTGAAA GCCCCACTCC CTTGTTAGGG 300
AGAGACATTT TAGCAAAAGC AGGGGCCATT ATACACCTGA ACATAGGAAA 350
AGGAATACCC ATTTGCTGTC CCCTGCTTGA GGAAGGAATT AATCCTGAAG 400
TCTGGGCAAT AGAAGGACAA TATGGACAAG CAAAGAATGC CCGTCCTGTT 450
CAAGTTAAAC TAAAGGATTC TGCCTCCTTT CCCTACCAAA GGAAGTACCC 500
TCTTAGACCC GAGGCCCTAC AAGGANCTCA AAAGATTGTT AAGGACCTAA 550
AAGCCCAAGG CCTAGTAAAA CCATGCAGTA GCCCCTGCAA TACTCCAATT 600
TTAGGAGTAA GGAAACCCAA CGGACAGTGG AGGTTAGTGC AAGATCTCAG 650
GATTATTAAT GAGGCTGTTT TTCCTCTATA CCCAGCTGTA TCTAGCCCTT 700
ATACTCTGCT TTCCCTAATA CCAGAGGAAG CAGAGTGGTT TACAGTCCTG 750
GACCTTAAGG ATGCCTTTTT CTGCATCCCT GTACGTCCTG ACTCTCAATT 800
CTTGTTTGCC TTTGAAGATC CTTTGAACCC AACGTCTCAA CTCACCTGGA 850
CTGTTTTACC CCAAGGGTTC AGGGATAGCC CCCATCTATT TGGCCAGGCA 900
TTAGCCCAAG ACTTGAGTCA ATTCTCATAC CTGGACACTC TTGTCCTTCA 950
GTACGTGGAT GATTTACTTT TAGTCGCCCG TTCAGAAACC TTGTGCCATC 1000
AAGCCACCCA AGAACTCTTA ACTTTCCTCA CTACCTGTGG CTACAAGGTT 1050
TCCAAACCAA AGGCTCGGCT CTGCTCACAG GAGATTAGAT ACTTAGGGCT 1100
AAAATTATCC AAAGGCACCA GGGCCCTCAG TGAGGAACGT ATCCAGCCTA 1150
TACTGGCTTA TCCTCATCCC AAAACCCTAA AGCAACTAAG AGGGTTCCTT 1200
GGCATAACAG GTTTCTGCCG AAAACAGATT CCCAGGTACA CCCCAATAGC 1250
CAGACCATTA TATACACTAA TTAGGGAAAC TCAGAAAGCC AATACCTATT 1300
TAGTAAGATG GACACCTACA GAAGTGGCTT TCCAGGCCCT AAAGAAGGCC 1350
CTAACCCAAG CCCCAGTGTT CAGCTTGCCA ACAGGGCAAG ATTTTTCTTT 1400
ATATGCCACA GAAAAAACAG GAATAGCTCT AGGAGTCCTT ACGCAGGTCT 1450
CAGGGATGAG CTTGCAACCC GTGGTATACC TGAGTAAGGA AATTGATGTA 1500
GTGGCAAAGG GTTGGCCTCA TTGTTTATGG GTAATGGCGG CAGTAGCAGT 1550
CTTAGTATCT GAAGCAGTTA AAATAATACA GGGAAGAGAT CTTACTGTGT 1600
GGACATCTCA TGATGTGAAC GGCATACTCA CTGCTAAAGG AGACTTGTGG 1650
TTGTCAGACA ACCATTTACT TAATTATCAG GCTCTATTAC TTGAAGAGCC 1700
AGTGCTGAGA CTGCGCACTT GTGCAACTCT TAAACCCGCC ACATTTCTTC 1750
CAGACAATGA AGAAAAGATA GAACATAACT GTCAACAAGT AATTGCTCAA 1800
ACCTATGCTG CTCGAGGGGA CCTTCTAGAG GTTCCCTTGA CTGATCCCGA 1850
CCTCAACTTG TATACTGATG GAAGTTCCTT GGCAGAAAAA GGACTTCGAA 1900
AAGCGGGGTA TGCAGTGATC AGTGATAATG GAATACTTGA AAGTAATCGC 1950
CTCACTCCAG GAACTAGTGC TCACCTGGCA GAACTAATAG CCCTCACTTG 2000
GGCACTAGAA TTAGGAGAAG GAAAAAGGGT AAATATATAT TCAGACTCTA 2050
AGTATGCTTA CCTAGTCCTC CATGCCCATG CAGCAATATG GAGAGAGAGG 2100
GAATTCCTAA CTTCTGAGGG AACACCTATC AACCATCAGG AAGCCATTAG 2150
GAGATTATTA TTGGCTGTAC AGAAACCTAA AGAGGTGGCA GTCTTACACT 2200
GCCAGGGTCA TCAGGAAGAA GAGGAAAGGG AAATAGAAGG CAATCGCCAA 2250
GCGGATATTG AAGCAAAAAA AGCCGCAAGG CAGGACTCTC CATTAGAAAT 2300
GCTT 2304



(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2364 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

ATGATCCAGC AGCAGGACNG AGGGTGCCCG GGGCAAGCGC CAGCCCATGC 50
CATCACCCTC ACAGAGCCCC AGGTATGCTT GACCATTGAG GGTCAGAAGG 100
GTNACTGTCT CCTGGACACT GGCGGNGCCT TCTCAGTCTT ACTTTCCTGT 150
CCTGGACAAC TGTCCTCCAG ATCTGTCACT GTCCGAGGGG TCCTAGGACA 200
GCCAGTCACT AGATACTTCT CCCAGCCACT AAGTTGTGAC TGGGGAACTT 250
TACTCTTCCC ACATGCTTTT CTAATTATGC CTGAAAGCCC CACTCTCTTG 300
TTGGGGAGAG ACATTCTAGC AAAAGCAGGG GCCATTATAC ATGTGAATAT 350
AGGAGAAGGA ACAACTGTTT GTTGTCCCCT GCTTGAGGAA GGAATTAATC 400
CTGAAGTCCG GGCAACAGAA GGACAATATG GACAAGCAAA GAATGCCCGT 450
CCTGTTCAAG TTAAACTAAA GGATTCCACC TCCTTTCCCT ACCAAAGGCA 500
GTACCCCCTC AGACCCGAGA CCCAACAAGA ACTCCAAAAG ATTGTAAAGG 550
ACCTAAAAGC CCAAGGCCTA GTAAAACCAA GCAATAGCCC TTGCAAGACT 600
CCAATTTTAG GAGTAAGGAA ACCCAACGGA CAGTGGAGGT TAGTGCAAGA 650
ACTCAGGATT ATCAATGAGG CTGTTGTTCC TCTATACCCA GCTGTACCTA 700
ACCCTTATAC AGTGCTTTCC CAAATACCAG AGGAAGCAGA GTGGTTTACA 750
GTCCTGGACC TTAAGGATGC CTTTTTCTGC ATCCCTGTAC GTCCTGACTC 800
TCAATTCTTG TTTGCCTTTG AAGATCCTTT GAACCCAACG TCTCAACTCA 850
CCTGGACTGT TTTACCCCAA GGGTTCAGGG ATAGCCCCCA TCTATTTGGC 900
CAGGCATTAG CCCAAGACTT GAGTCAATTC TCATACCTGG ACACTCTTGT 950
CCTTCAGTAC ATGGATGATT TACTTTTAGT CGCCCGTTCA GAAACCTTGT 1000
GCCATCAAGC CACCCAAGAA CTCTTAACTT TCCTCACTAC CTGTGGCTAC 1050
AAGGTTTCCA AACCAAAGGC TCGGCTCTGC TCACAGGAGA TTAGATACTN 1100
AGGGCTAAAA TTATCCAAAG GCACCAGGGC CCTCAGTGAG GAACGTATCC 1150
AGCCTATACT GGCTTATCCT CATCCCAAAA CCCTAAAGCA ACTAAGAGGG 1200
TTCCTTGGCA TAACAGGTTT CTGCCGAAAA CAGATTCCCA GGTACASCCC 1250
AATAGCCAGA CCATTATATA CACTAATTAN GGAAACTCAG AAAGCCAATA 1300
CCTATTTAGT AAGATGGACA CCTACAGAAG TGGCTTTCCA GGCCCTAAAG 1350
AAGGCCCTAA CCCAAGCCCC AGTGTTCAGC TTGCCAACAG GGCAAGATTT 1400
TTCTTTATAT GCCACAGAAA AAACAGGAAT AGCTCTAGGA GTCCTTACGC 1450
AGGTCTCAGG GATGAGCTTG CAACCCGTGG TATACCTGAG TTU^GGAAATT 1500
GATGTAGTGG CAAAGGGTTG GCCTCATNGT TTATGGGTAA TGGNGGCAGT 1550
AGCAGTCTNA GTATCTGAAG CAGTTAAAAT AATACAGGGA AGAGATCTTN 1600
CTGTGTGGAC ATCTCATGAT GTGAACGGCA TACTSRCTGC TAAAGGAGAC 1650
TTGTGGTTGT CAGACAACCA TTTACTTAAN TAYCAGGCYY TATTACTTGA 1700
AGAGCCAGTG CTGNGACTGC GCACTTGTCC AACTCTTAAA CCCAAACTTA 1750
TGCTGCCCAG AAGGATCTTT NTAGAGGTCC CCTTAGCCAA CCCTGACCTC 1800
AACTATATAT ATACTGATGG AAGTTCGTTT GTAGAAAAGG GATTACAAAG 1850
GGNAGGATAT NCCATAGGTG TTAGTGATAA AGCAGTACTT GAAAGTAAGC 1900
CTCTTCCCCC CCAGGGACCA GCGCCCCCGT TAGCAGAACT AGTGGCACTG 1950
ACCCCGCGAG CCTTAGAACT TTGGAAAGGG AGGAGGATAA ATGTGTATAC 2000
AGATAGCAAG TATGCTTATC TAATCCGAAA TGCCCATGTT GCAATATGGA 2050
AAGAAAGGGA GTTCCTAACC TCTGGGGGAA CCCCCATTAA ATACCACAAG 2100
TTAATCATGG AGTTATTGCA CACAGTGCAA AAACTCAAGG AGGTGGAAGT 2150
CTTACACTGC CAAAGCCATC AGAAAAGGGA AAGAGGGGAA GAGCAGCATA 2200
AGTGGCTACA GAGGCAAGGA AAGACTAGCA GAAAGGAAAG AGAGAAAGAG 2250
ACAGAAAGTC AGAGAGAGAG AGAGGAAGAG ACAGAGCACA AAGAGGGAGT 2300
CAGAGAGAGA GAGAGACAGA GAGTCAGAGA GAAGGAAAGA GAGAGAGGAA 2350
GAGACAAAGA ATGAH 2365

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 100
RDILAKAGAI IHLNIGKGIP ICCPLLEEGI NPEVWAIEGQ YGQAKNARPV 150
QVKLKDSASF PYQRKYPLRP EALQGXQKIV KDLKAQGLVK PCSSPCNTPI 200
LGVRKPNGQW RLVQDLRIIN EAVFPLYPAV SSPYTLLSLI PEEAEWFTVL 250
DLKDAFFCIP VRPDSQFLFA FEDPLNPTSQ LTWTVLPQGF RDSPHLFGQA 300
LAQDLSQFSY LDTLVLQYVD DLLLVARSET LCHQATQELL TFLTTCGYKV 350
SKPKARLCSQ EIRYLGLKLS KGTRALSEER IQPILAYPHP KTLKQLRGFL 400
GITGFCRKQI PRYTPIARPL YTLIRETQKA NTYLVRWTPT EVAFQALKKA 450
LTQAPVFSLP TGQDFSLYAT EKTGIALGVL TQVSGMSLQP VVYLSKEIDV 500
VAKGWPHCLW VMAAVAVLVS EAVKIIQGRD LTVWTSHDVN GILTAKGDLW 550
LSDNHLLNYQ ALLLEEPVLR LRTCATLKPA TFLPDNEEKI EHNCQQVIAQ 600
TYAARGDLLE VPLTDPDLNL YTDGSSLAEK GLRKAGYAVI SDNGILESNR 650
LTPGTSAHLA ELIALTWALE LGEGKRVNIY SDSKYAYLVL HAHAAIWRER 700
EFLTSEGTPI NHQEAIRRLL LAVQKPKEVA VLHCQGHQEE EEREIEGNRQ 750
ADIEAKKAAR QDSPLEML 768



(2) INFORMATION FOR SEQ ID NO: 96:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 114 amino acids
(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

SSSRTEGARG KCQPMPSPSE PRVCLTIESQ EVNCLLDTGA AFSVLLSCPR 50
QLSSRSVTIR GVLGQPVTTY FSQPLSCDWG TLLFSHAFLI MPESPTPLLG 100
RDILAKAGAI IHLN

(2) INFORMATION FOR SEQ ID NO: 97:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: amino acids
(B) TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

IGKGIPICCPLLEEGINPEVWAIEGQYGQAKNARPV
QVKLKDSASFPYQRKYPLRPEALQGXQKIVKDLKAQGLVKPCSSPCNTPI
LGVRKPNGQWRLVQDLRIINEAVFPLYPAVSSPYTLLSLIPEEAEWFTVL
DLKDAFFCIPVRPDSQFLFAFEDPLNPTSQLTWTVLPQGFRDSPHLFGQA
LAQDLSQFSYLDTLVLQYVDDLLLVARSETLCHQATQELLTFLTTCGYKV
SKPKARLCSQEIRYLGLKLSKGTRALSEERIQPILAYPHPKTLKQLRGFL
GITGFCRKQIPRYTPIARPLYTLIRETQKANTYLVRWTPTEVAFQALKKA
LTQAPVFSLPTGQDFSLYATEKTGIALGVLTQVSGMSLQPWYLSKEIDV
VAKGWPHCLWVMAAVAVLVSEAVKIIQGRDLTVWTSHDVNGILTAKGDLW
LSDNHLLNYQALLLEEPVLRLRTCATLKPATFLPDNEEKIEHNCQQVIAQ
TYAARGDLLEVPLTDPDLNLYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 98: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: amino acids 

(B) TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

LYTDGSSLAEKGLRKAGYAVISDNGILESNR
LTPGTSAHLAELIALTWALELGEGKRVNIYSDSKYAYLVLHAHAAIWRER
EFLTSEGTPINHQEAIRRLLLAVQKPKEVAVLHCQGHQEEEEREIEGNRQ
ADIEAKKAARQDSPLEML

(2) INFORMATION FOR SEQ ID NO: 99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:
AGGAGTAAGG AAACCCAACG GAC 23

(2) INFORMATION FOR SEQ ID NO: 100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:
TAAGAGTTGC ACAAGTGCG 19

(2) INFORMATION FOR SEQ ID NO: 101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:
TCAGGGATAG CCCCCATCTA T

(2) INFORMATION FOR SEQ ID NO: 102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:
AACCCTTTGC CACTACATCA ATTT

(2) INFORMATION FOR SEQ ID NO: 103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:
AGCAGCAGGA CTGAGGGT 18

(2) INFORMATION FOR SEQ ID NO: 104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:
CTGTCCGTTG GGTTTCCTTA CTCCT 25

(2) INFORMATION FOR SEQ ID NO: 105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
GACAGCAAAT GGGTATTCCT TTCC 24

(2) INFORMATION FOR SEQ ID NO: 106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:
AGGAGTAAGG AAACCCAACG GACA 24

(2) INFORMATION FOR SEQ ID NO: 107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:
TGTATATAAT GGTCTGGCTA TTGGG 25



(2) INFORMATION FOR SEQ ID NO: 108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:
TTCGGCAGAA ACCTGTTATG CCAAGG 26

(2) INFORMATION FOR SEQ ID NO: 109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:
GGCTCTGCTC ACAGGAGATT AGATAC 26

(2) INFORMATION FOR SEQ ID NO: 110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:
AAAGGCACCA GGGCCCTCAG TGAGGA 26

(2) INFORMATION FOR SEQ ID NO: 111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:
GGTTTAAGAG TTGCACAAGT GCGCAGTC 28

(2) INFORMATION FOR SEQ ID NO: 112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 310 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

GCTTATAGAA GGACCCCTAG TATGGGGTAA TCCCCTCTGG GAAACCAAGC CCCAGTACTC 60
AGCAGGAAAA ATAGAATAGG AAACCTCACA AGGACATACT TTCCTCCCCT CCAGATGGCT 120
AGCCACTGAG GAAGGAAAAA TACTTTCACC TGCAGCTAAC CAACAGAAAT TACTTAAAAC 180
CCTTCACCAA ACCTTCCACT TAGGCATTGA TAGCACCCAT CAGATGGCCA AATTATTATT 240
TACTGGACCA GGCCTTTTCA AAACTATCAA GAAGATAGTC AGGGGCTGTG AAGTGTGCCA 300
AAGAAATAAT 310

(2) INFORMATION FOR SEQ ID NO: 113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 103 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Leu Ile Glu Gly Pro Leu Val Trp Gly Asn Pro Leu Trp Glu Thr Lys
1 5 10 15

Pro Gln Tyr Ser Ala Gly Lys Ile Glu Xaa Glu Thr Ser Gln Gly His
20 25 30

Thr Phe Leu Pro Ser Arg Trp Leu Ala Thr Glu Glu Gly Lys Ile Leu
35 40 45

Ser Pro Ala Ala Asn Gln Gln Lys Leu Leu Lys Thr Leu His Gln Thr
50 55 60

Phe His Leu Gly Ile Asp Ser Thr His Gln Met Ala Lys Leu Leu Phe
65 70 75 80

Thr Gly Pro Gly Leu Phe Lys Thr Ile Lys Lys Ile Val Arg Gly Cys
85 90 95

Glu Val Cys Gln Arg Asn Asn
100



(2) INFORMATION FOR SEQ ID NO: 114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 635 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

CCCTGTATCT TTAACCTCCT TGTTAAGTTT GTCTCTTCCA GAATCAAAAC TGTAAAACTA 60
CAAATTGTTC TTCAAATGGA GCACCAGATG GAGTCCATGA CTAAGATCCA CCGTGGACCC 120
CTGGACCGGC CTGCTAGCCC ATGCTCCGAT GTTAATGACA TTGAAGGCAC CCCTCCCGAG 180
GAAATCTCAA CTGCACAACC CCTACTATGC CCCAATTCAG CGGGAAGCAG TTAGAGCGGT 240
CATCAGCCAA CCTCCCCAAC AGCACTTGGG TTTTCCTGTT GAGAGGGGGG ACTGAGAGAC 300
AGGACTAGCT GGATTTCCTA GGCCAACGAA GAATCCCTAA GCCTAGCTGG GAAGGTGACT 360
GCATCCACCT CTAAACATGG GGCTTGCAAC TTAGCTCACA CCCGACCAAT CAGAGAGCTC 420
ACTAAAATGC TAATTAGGCA AAAATAGGAG GTAAAGAAAT AGCCAATCAT CTATTGCCTG 480
AGAGCACAGC GGGAGGGACA AGGATCGGGA TATAAACCCA GGCATTCGAG CCGGCAACGG 540
CAACCCCCTT TGGGTCCCCT CCCTTTGTAT GGGCGCTCTG TTTTCACTCT ATTTCACTCT 600
ATTAAATCTT GCAACTGAAA AAAAAAAAAA AAAAA 635

(2) INFORMATION FOR SEQ ID NO: 115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 77 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile Lys
1 5 10 15

Thr Val Lys Leu Gln Ile Val Leu Gln Met Glu His Gln Met Glu Ser
20 25 30

Met Thr Lys Ile His Arg Gly Pro Leu Asp Arg Pro Ala Ser Pro Cys
35 40 45

Ser Asp Val Asn Asp Ile Glu Gly Thr Pro Pro Glu Glu Ile Ser Thr
50 55 60

Ala Gln Pro Leu Leu Cys Pro Asn Ser Ala Gly Ser Ser
65 70 75

(2) INFORMATION FOR SEQ ID NO: 116: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleotide 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 
TGGGGTTCCA TTTGTAAGAC CATCTGTAGC TT 32 

(2) INFORMATION FOR SEQ ID NO: 117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1481 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATGGCCCTCC CTTATCATAC TTTTCTCTTT ACTGTTCTCT TACCCCCTTT CGCTCTCACT 60
GCACCCCCTC CATGCTGCTG TACAACCAGT AGCTCCCCTT ACCAAGAGTT TCTATGAAGA 120
ACGCGGCTTC CTGGAAATAT TGATGCCCCA TCATATAGGA GTTTATCTAA GGGAAACTCC 180
ACCTTCACTG CCCACACCCA TATGCCCCGC AACTGCTATA ACTCTGCCAC TCTTTGCATG 240
CATGCAAATA CTCATTATTG GACAGGGAAA ATGATTAATC CTAGTTGTCC TGGAGGACTT 300
GGAGCCACTG TCTGTTGGAC TTACTTCACC CATACCAGTA TGTCTGATGG GGGTGGAATT 360
CAAGGTCAGG CAAGAGAAAA ACAAGTAAAG GAAGCAATCT CCCAACTGAC CCGGGGACAT 420
AGCACCCCTA GCCCCTACAA AGGACTAGTT CTCTCAAAAC TACATGAAAC CCTCCGTACC 480
CATACTCGCC TGGTGAGCCT ATTTAATACC ACCCTCACTC GGCTCCATGA GGTCTCAGCC 540
CAAAACCCTA CTAACTGTTG GATGTGCCTC CCCCTGCACT TCAGGCCATA CATTTCAATC 600
CCTGTTCCTG AACAATGGAA CAACTTCAGC ACAGAAATAA ACACCACTTC CGTTTTAGTA 660
GGACCTCTTG TTTCCAATCT GGAAATAACC CATACCTCAA ACCTCACCTG TGTAAAATTT 720
AGCAATACTA TAGACACAAC CAGCTCCCAA TGCATCAGGT GGGTAACACC TCCCACACGA 780
ATAGTCTGCC TACCCTCAGG AATATTTTTT GTCTGTGGTA CCTCAGCCTA TCATTGTTTG 840
AATGGCTCTT CAGAATCTAT GTGCTTCCTC TCATTCTTAG TGCCCCCTAT GACCATCTAC 900
ACTGAACAAG ATTTATACAA TCATGTCGTA CCTAAGCCCC ACAACAAAAG AGTACCCATT 960
CTTCCTTTTG TTATCAGAGC AGGAGTGCTA GGCAGACTAG GTACTGGCAT TGGCAGTATC 1020
ACAACCTCTA CTCAGTTCTA CTACAAACTA TCTCAAGAAA TAAATGGTGA CATGGAACAG 1080
GTCACTGACT CCCTGGTCAC CTTGCAAGAT CAACTTAACT CCCTAGCAGC AGTAGTCCTT 1140
CAAAATCGAA GAGCTTTAGA CTTGCTAACC GCCAAAAGAG GGGGAACCTG TTTATTTTTA 1200
GGAGAAGAAC GCTGTTATTA TGTTAATCAA TCCAGAATTG TCACTGAGAA AGTTAAAGAA 1260
ATTCGAGATC GAATACAATG TAGAGCAGAG GAGCTTCAAA ACACCGAACG CTGGGGCCTC 1320
CTCAGCCAAT GGATGCCCTG GGTTCTCCCC TTCTTAGGAC CTCTAGCAGC TCTAATATTG 1380
TTACTCCTCT TTGGACCCTG TATCTTTAAC CTCCTTGTTA AGTTTGTCTC TTCCAGAATT 1440
GAAGCTGTAA AGCTACAGAT GGTCTTACAA ATGGAACCCC A 1481



(2) INFORMATION FOR SEQ ID NO: 118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 493 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

Met Ala Leu Pro Tyr His Thr Phe Leu Phe Thr Val Leu Leu Pro Pro
1 5 10 15

Phe Ala Leu Thr Ala Pro Pro Pro Cys Cys Cys Thr Thr Ser Ser Ser
20 25 30

Pro Tyr Gln Glu Phe Leu Xaa Arg Thr Arg Leu Pro Gly Asn Ile Asp
35 40 45

Ala Pro Ser Tyr Arg Ser Leu Ser Lys Gly Asn Ser Thr Phe Thr Ala
50 55 60

His Thr His Met Pro Arg Asn Cys Tyr Asn Ser Ala Thr Leu Cys Met
65 70 75 80

His Ala Asn Thr His Tyr Trp Thr Gly Lys Met Ile Asn Pro Ser Cys
85 90 95

Pro Gly Gly Leu Gly Ala Thr Val Cys Trp Thr Tyr Phe Thr His Thr
100 105 110

Ser Met Ser Asp Gly Gly Gly Ile Gln Gly Gln Ala Arg Glu Lys Gln
115 120 125

Val Lys Glu Ala Ile Ser Gln Leu Thr Arg Gly His Ser Thr Pro Ser
130 135 140

Pro Tyr Lys Gly Leu Val Leu Ser Lys Leu His Glu Thr Leu Arg Thr
145 150 155 160

His Thr Arg Leu Val Ser Leu Phe Asn Thr Thr Leu Thr Arg Leu His
165 170 175

Glu Val Ser Ala Gln Asn Pro Thr Asn Cys Trp Met Cys Leu Pro Leu
180 185 190

His Phe Arg Pro Tyr Ile Ser Ile Pro Val Pro Glu Gln Trp Asn Asn
195 200 205

Phe Ser Thr Glu Ile Asn Thr Thr Ser Val Leu Val Gly Pro Leu Val
210 215 220

Ser Asn Leu Glu Ile Thr His Thr Ser Asn Leu Thr Cys Val Lys Phe
225 230 235 240

Ser Asn Thr Ile Asp Thr Thr Ser Ser Gln Cys Ile Arg Trp Val Thr
245 250 255

Pro Pro Thr Arg Ile Val Cys Leu Pro Ser Gly Ile Phe Phe Val Cys
260 265 270

Gly Thr Ser Ala Tyr His Cys Leu Asn Gly Ser Ser Glu Ser Met Cys
275 280 285

Phe Leu Ser Phe Leu Val Pro Pro Met Thr Ile Tyr Thr Glu Gln Asp
290 295 300

Leu Tyr Asn His Val Val Pro Lys Pro His Asn Lys Arg Val Pro Ile
305 310 315 320

Leu Pro Phe Val Ile Arg Ala Gly Val Leu Gly Arg Leu Gly Thr Gly
325 330 335

Ile Gly Ser Ile Thr Thr Ser Thr Gln Phe Tyr Tyr Lys Leu Ser Gln
340 345 350

Glu Ile Asn Gly Asp Met Glu Gln Val Thr Asp Ser Leu Val Thr Leu
355 360 365

Gln Asp Gln Leu Asn Ser Leu Ala Ala Val Val Leu Gln Asn Arg Arg
370 375 380

Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr Cys Leu Phe Leu
385 390 395 400

Gly Glu Glu Arg Cys Tyr Tyr Val Asn Gln Ser Arg Ile Val Thr Glu
405 410 415

Lys Val Lys Glu Ile Arg Asp Arg Ile Gln Cys Arg Ala Glu Glu Leu
420 425 430

Gln Asn Thr Glu Arg Trp Gly Leu Leu Ser Gln Trp Met Pro Trp Val
435 440 445

Leu Pro Phe Leu Gly Pro Leu Ala Ala Leu Ile Leu Leu Leu Leu Phe
450 455 460

Gly Pro Cys Ile Phe Asn Leu Leu Val Lys Phe Val Ser Ser Arg Ile
465 470 475 480

Glu Ala Val Lys Leu Gln Met Val Leu Gln Met Glu Pro
485 490



(2) INFORMATION FOR SEQ ID NO: 119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:
TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CG



(2) INFORMATION FOR SEQ ID NO: 120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1329 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

TCAAAATCGA AGAGCTTTAG ACTTGCTAAC CGCCAAAAGA GGGGGAACCT GTTTATTTTT 60
AGGGGAAGAA TGCTGTTAGT ATGTTAATCA ATCTGGAATC ATTACTGAGA AAGTTAAAGA 120
AATTTGAGAT CGAATATAAT GTAGAGCAGA GGACCTTCAA AACACTGCAC CCTGGGGCCT 180
CCTCAGCCAA TGGATGCCCT GGACTCTCCC CTTCTTAGGA CCTCTAGCAG CTATAATATT 240
TTTACTCCTC TTTGGACCCT GTATCTTCAA CTTCCTTGTT AAGTTTGTCT CTTCCAGAAT 300
TGAAGCTGTA AAGCTACAAA TAGTTCTTCA AATGGAACCC CAGATGCAGT CCATGACTAA 360
AATCTACCGT GGACCCCTGG ACCGGCCTGC TAGACTATGC TCTGATGTTA ATGACATTGA 420
AGTCACCCCT CCCGAGGAAA TCTCAACTGC ACAACCCCTA CTACACTCCA ATTCAGTAGG 480
AAGCAGTTAG AGCAGTTGTC AGCCAACCTC CCCAACAGTA CTTGGGTTTT CCTGTTGAGA 540
GGGTGGACTG AGAGACAGGA CTAGCTGGAT TTCCTAGGCT GACTAAGAAT CCCNAAGCCT 600
ANCTGGGAAG GTGACCGCAT CCATCTTTAA ACATGGGGCT TGCAACTTAG CTCACACCCG 660
ACCAATCAGA GAGCTCACTA AAATGCTAAT CAGGCAAAAA CAGGAGGTAA AGCAATAGCC 720
AATCATCTAT TGCCTGAGAG CACAGCGGGA AGGACAAGGA TTGGGATATA AACTCAGGCA 780
TTCAAGCCAG CAACAGCAAC CCCCTTTGGG TCCCCTCCCA TTGTATGGGA GCTCTGTTTT 840
CACTCTATTT CACTCTATTA AATCATGCAA CTGCACTCTT CTGGTCCGTG TTTTTTATGG 900
CTCAAGCTGA GCTTTTGTTC GCCATCCACC ACTGCTGTTT GCCACCGTCA CAGACCCGCT 960
GCTGACTTCC ATCCCTTTGG ATCCAGCAGA GTGTCCACTG TGCTCCTGAT CCAGCGAGGT 1020
ACCCATTGCC ACTCCCGATC AGGCTAAAGG CTTGCCATTG TTCCTGCATG GCTAAGTGCC 1080
TGGGTTTGTC CTAATAGAAC TGAACACTGG TCACTGGGTT CCATGGTTCT CTTCCATGAC 1140
CCACGGCTTC TAATAGAGCT ATAACACTCA CCGCATGGCC CAAGATTCCA TTCCTTGGTA 1200
TCTGTGAGGC CAAGAACCCC AGGTCAGAGA ANGTGAGGCT TGCCACCATT TGGGAAGTGG 1260
CCCACTGCCA TTTTGGTAGC GGCCCACCAC CATCTTGGGA GCTGTGGGAG CAAGGATCCC 1320
CCAGTAACA 1329

(2) INFORMATION FOR SEQ ID NO: 121:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 162 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

Gln Asn Arg Arg Ala Leu Asp Leu Leu Thr Ala Lys Arg Gly Gly Thr
1 5 10 15

Cys Leu Phe Leu Gly Glu Glu Cys Cys Xaa Tyr Val Asn Gln Ser Gly
20 25 30

Ile Ile Thr Glu Lys Val Lys Glu Ile Xaa Asp Arg Ile Xaa Cys Arg
35 40 45

Ala Glu Asp Leu Gln Asn Thr Ala Pro Trp Gly Leu Leu Ser Gln Trp
50 55 60

Met Pro Trp Thr Leu Pro Phe Leu Gly Pro Leu Ala Ala Ile Ile Phe
65 70 75 80

Leu Leu Leu Phe Gly Pro Cys Ile Phe Asn Phe Leu Val Lys Phe Val
85 90 95

Ser Ser Arg Ile Glu Ala Val Lys Leu Gln Ile Val Leu Gln Met Glu
100 105 110

Pro Gln Met Gln Ser Met Thr Lys Ile Tyr Arg Gly Pro Leu Asp Arg
115 120 125

Pro Ala Arg Leu Cys Ser Asp Val Asn Asp Ile Glu Val Thr Pro Pro
130 135 140

Glu Glu Ile Ser Thr Ala Gln Pro Leu Leu His Ser Asn Ser Val Gly
145 150 155 160

Ser Ser



(2) INFORMATION FOR SEQ ID NO: 122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:
GGCATTGATA GCACCCATCA G 21

(2) INFORMATION FOR SEQ ID NO: 123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:
CATGTCACCA GGGTGGAATA G 21
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(2) INFORMATION FOR SEQ ID NO: 124: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 758 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

GGCATTGATA GCACCCATCA GATGGCCAAA TCATTATTTA CTGGACCAGG CCTTTTCAAA 60
ACTATCAAGC AGATAGGGCC CGTGAAGCAT GCCAAAGAAA TAATCCCCTG CCTTATCGCC 120
ATGTTCCTTC AGGAGAACAA AGAACAGGCC ATTACCCAGG GGAAGACTGG CAACTAGATT 180
TTACCCACAT GGCCAAATGT CAGGGATTTC AGCATCTACT AGTCTGGGCA GATACTTTCA 240
CTGGTTGGGT GGAGTCTTCT CCTTGTAGGA CAGAAAAGAC CCAAGAGGTA ATAAAGGCAC 300
TAATGAAATA ATTCCCAGAT TTGGACTTCC CCCAGGATTA CAGGGTGACA ATGGCCCCGC 360
TTTCAAGGCT GCAGTAACCC AGGGAGTATC CCAGGTGTTA GGCATACAAT ATCACTTACA 420
CTGTGCCTGG AGGCCACAAT CCTCCAGAAA AGTCAAGAAA ATGAATGAAA CACTCAAAGA 480
TCTAAAAAAG CTAACCCAAG AAACCCACAT TGCATGACCT GTTCTGTTGC CTATAACCTT 540
ACTAAGAATC CATAACTATC CCCCAAAAAG CAGGACTTAG CCCATACGAG ATGCTATATG 600
GATGGCCTTT CCTAACCAAT GACCTTGTGC TTGACTGAGA AATGGCCAAC TTAGTTGCAG 660
ACATCACCTC CTTAGCCAAA TATCAACAAG TTCTTAAAAC ATCACAGGGA ACCTGTCCCC 720
GAGAGGAGGG AAAGGAACTA TTCCACCCTG GTGACATG 758



(2) INFORMATION FOR SEQ ID NO: 126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:
CGGACATCCA AAGTGATGGG AAACG

(2) INFORMATION FOR SEQ ID NO: 127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
GGACAGGAAA GTAAGACTGA GAAGGC 26

(2) INFORMATION FOR SEQ ID NO: 128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:
CCTAGAACGT ATTCTGGAGA ATTGGG 26

(2) INFORMATION FOR SEQ ID NO: 129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:
TGGCTCTCAA TGGTCAAACA TACCCG 26



(2) INFORMATION FOR SEQ ID NO: 130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1511 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

CCTAGAACGT ATTCTGGAGA ATTGGGACCA ATGTGACACT CAGACGCTAA GAAAGAAACG 60
ATTTATATTC TTCTGCAGTA CCGCCTGGCC ACAATATCCT CTTCAAGGGA GAGAAACCTG 120
GCTTCCTGAG GGAAGTATAA ATTATAACAT CATCTTACAG CTAGACCTCT TCTGTAGAAA 180
GGAGGGCAAA TGGAGTGAAG TGCCATATGT GCAAACTTTC TTTTCATTAA GAGACAACTC 240
ACAATTATGT AAAAAGTGTG GTTTATGCCC TACAGGAAGC CCTCAGAGTC CACCTCCCTA 300
CCCCAGCGTC CCCTCCCCGA CTCCTTCCTC AACTAATAAG GACCCCCCTT TAACCCAAAC 360
GGTCCAAAAG GAGATAGACA AAGGGGTAAA CAATGAACCA AAGAGTGCCA ATATTCCCCG 420
ATTATGCCCC CTCCAAGCAG TGAGAGGAGG AGAATTCGGC CCAGCCAGAG TGCCTGTACC 480
TTTTTCTCTC TCAGACTTAA AGCAAATTAA AATAGACCTA GGTAAATTCT CAGATAACCC 540
TGACGGCTAT ATTGATGTTT TACAAGGGTT AGGACAATCC TTTGATCTGA CATGGAGAGA 600
TATAATGTTA CTACTAAATC AGACACTAAC CCCAAATGAG AGAAGTGCCG CTGTAACTGC 660
AGCCCGAGAG TTTGGCGATC TTTGGTATCT CAGTCAGGCC AACAATAGGA TGACAACAGA 720
GGAAAGAACA ACTCCCACAG GCCAGCAGGC AGTTCCCAGT GTAGACCCTC ATTGGGACAC 780
AGAATCAGAA CATGGAGATT GGTGCCACAA ACATTTGCTA ACTTGCGTGC TAGAAGGACT 840
GAGGAAAACT AGGAAGAAGC CTATGAATTA CTCAATGATG TCCACTATAA CACAGGGAAA 900
GGAAGAAAAT CTTACTGCTT TTCTGGACAG ACTAAGGGAG GCATTGAGGA AGCATACCTC 960
CCTGTCACCT GACTCTATTG AAGGCCAACT AATCTTAAAG GATAAGTTTA TCACTCAGTC 1020
AGCTGCAGAC ATTAGAAAAA ACTTCAAAAG TCTGCCTTAG GCCCGGAGCA GAACTTAGAA 1080
ACCCTATTTA ACTTGGCATC CTCAGTTTTT TATAATAGAG ATCAGGAGGA GCAGGCGAAA 1140
CGGGACAAAC GGGATAAAAA AAAAAGGGGG GGTCCACTAC TTTAGTCATG GCCCTCAGGC 1200
AAGCAGACTT TGGAGGCTCT GCAAAAGGGA AAAGCTGGGC AAATCAAATG CCTAATAGGG 1260
CTGGCTTCCA GTGCGGTCTA CAAGGACACT TTAAAAAAGA TTATCCAAGT AGAAATAAGC 1320
CGCCCCCTTG TCCATGCCCC TTACGTCAAG GGAATCACTG GAAGGCCCAC TGCCCCAGGG 1380
GATGAAGATA CTCTGAGTCA GAAGCCATTA ACCAGATGAT CCAGCAGCAG GACTGAGGGT 1440
GCCCGGGGCG AGCGCCAGCC CATGCCATCA CCCTCACAGA GCCCCGGGTA TGTTTGACCA 1500
TTGAGAGCCA A 1511



(2) INFORMATION FOR SEQ ID NO: 131:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu
1 5 10 15

Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr
20 25 30

Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr
35 40 45

Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp
50 55 60

Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser
65 70 75 80

Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser
85 90 95

Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn
100 105 110

Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly
115 120 125

Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu
130 135 140

Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro
145 150 155 160

Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe
165 170 175

Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln
180 185 190

Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr
195 200 205

Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe
210 215 220

Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu
225 230 235 240

Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro
245 250 255

His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu
260 265 270

Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met
275 280 285

Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu
290 295 300

Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser
305 310 315 320

Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe
325 330 335

Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro
340 345 350



(2) INFORMATION FOR SEQ ID NO: 132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:
TGCTGGAATT CGGGATCCTA GAACGTATTC

(2) INFORMATION FOR SEQ ID NO: 133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:
AGTTCTGCTC CGAAGCTTAG GCAGACTTTT



(2) INFORMATION FOR SEQ ID NO: 135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 398 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

Met Gly Ser Ser His His His His His His Ser Ser Gly Leu Val Pro
1 5 10 15

Arg Gly Ser His Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg
20 25 30

Ile Leu Glu Arg Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr
35 40 45

Leu Arg Lys Lys Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln
50 55 60

Tyr Pro Leu Gln Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn
65 70 75 80

Tyr Asn Ile Ile Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys
85 90 95

Trp Ser Glu Val Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn
100 105 110

Ser Gln Leu Cys Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln
115 120 125

Ser Pro Pro Pro Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr
130 135 140

Asn Lys Asp Pro Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys
145 150 155 160

Gly Val Asn Asn Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro
165 170 175

Leu Gln Ala Val Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val
180 185 190

Pro Phe Ser Leu Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys
195 200 205

Phe Ser Asp Asn Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly
210 215 220

Gln Ser Phe Asp Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln
225 230 235 240

Thr Leu Thr Pro Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu
245 250 255

Phe Gly Asp Leu Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr
260 265 270

Glu Glu Arg Thr Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp
275 280 285

Pro His Trp Asp Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His
290 295 300

Leu Leu Thr Cys Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro
305 310 315 320

Met Asn Tyr Ser Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn
325 330 335

Leu Thr Ala Phe Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr
340 345 350

Ser Leu Ser Pro Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys
355 360 365

Phe Ile Thr Gln Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu
370 375 380

Pro Lys Leu Ala Ala Ala Leu Glu His His His His His His
385 390 395

(2) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

Met Ala Ser Met Thr Gly Gly Gln Gln Met Gly Arg Ile Leu Glu Arg
1 5 10 15

Ile Leu Glu Asn Trp Asp Gln Cys Asp Thr Gln Thr Leu Arg Lys Lys
20 25 30

Arg Phe Ile Phe Phe Cys Ser Thr Ala Trp Pro Gln Tyr Pro Leu Gln
35 40 45

Gly Arg Glu Thr Trp Leu Pro Glu Gly Ser Ile Asn Tyr Asn Ile Ile
50 55 60

Leu Gln Leu Asp Leu Phe Cys Arg Lys Glu Gly Lys Trp Ser Glu Val
65 70 75 80

Pro Tyr Val Gln Thr Phe Phe Ser Leu Arg Asp Asn Ser Gln Leu Cys
85 90 95

Lys Lys Cys Gly Leu Cys Pro Thr Gly Ser Pro Gln Ser Pro Pro Pro
100 105 110

Tyr Pro Ser Val Pro Ser Pro Thr Pro Ser Ser Thr Asn Lys Asp Pro
115 120 125

Pro Leu Thr Gln Thr Val Gln Lys Glu Ile Asp Lys Gly Val Asn Asn
130 135 140

Glu Pro Lys Ser Ala Asn Ile Pro Arg Leu Cys Pro Leu Gln Ala Val
145 150 155 160

Arg Gly Gly Glu Phe Gly Pro Ala Arg Val Pro Val Pro Phe Ser Leu
165 170 175

Ser Asp Leu Lys Gln Ile Lys Ile Asp Leu Gly Lys Phe Ser Asp Asn
180 185 190

Pro Asp Gly Tyr Ile Asp Val Leu Gln Gly Leu Gly Gln Ser Phe Asp
195 200 205

Leu Thr Trp Arg Asp Ile Met Leu Leu Leu Asn Gln Thr Leu Thr Pro
210 215 220

Asn Glu Arg Ser Ala Ala Val Thr Ala Ala Arg Glu Phe Gly Asp Leu
225 230 235 240

Trp Tyr Leu Ser Gln Ala Asn Asn Arg Met Thr Thr Glu Glu Arg Thr
245 250 255

Thr Pro Thr Gly Gln Gln Ala Val Pro Ser Val Asp Pro His Trp Asp
260 265 270

Thr Glu Ser Glu His Gly Asp Trp Cys His Lys His Leu Leu Thr Cys
275 280 285

Val Leu Glu Gly Leu Arg Lys Thr Arg Lys Lys Pro Met Asn Tyr Ser
290 295 300

Met Met Ser Thr Ile Thr Gln Gly Lys Glu Glu Asn Leu Thr Ala Phe
305 310 315 320

Leu Asp Arg Leu Arg Glu Ala Leu Arg Lys His Thr Ser Leu Ser Pro
325 330 335

Asp Ser Ile Glu Gly Gln Leu Ile Leu Lys Asp Lys Phe Ile Thr Gln
340 345 350

Ser Ala Ala Asp Ile Arg Lys Asn Phe Lys Ser Leu Pro Lys Leu Ala
355 360 365

Ala Ala Leu Glu His His His His His His
370 375

(2) INFORMATION FOR SEQ ID NO: 138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
CTTGGAGGGT GCATAACCAG GGAAT 25

(2) INFORMATION FOR SEQ ID NO: 139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:
TGTCCGCTGT GCTCCTGATC 20

(2) INFORMATION FOR SEQ ID NO: 140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:
CTATGTCCTT TTGGACTGTT TGGGT 25

(2) INFORMATION FOR SEQ ID NO: 141:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 764 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC AGCCCGCCAC 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC AGGTAACAAT TTGGTGACCA CGAAGGGACC 360
TGAATCCGCA ACCATGAAGG GATCTCCAAA GCAATTGGAA ATGTTCCTCC CAAGGCAAAA 420
ATGCCCCTAA GATGTATTCT GGAGAATTGG GACCAATTTG ACCCTCAGAC AGTAAGAAAA 480
AAATGACTTA TATTCTTCTG CAGTACCGCC CTGGCCACGA TATCCTCTTC AAGGGGGAGA 540
AACCTGGCCT CCTGAGGGAA GTATAAATTA TAACACCATC TTACAGCTAG ACCTGTTTTG 600
TAGAAAAGGA GGCAAATGGA GTGAAGTGCC ATATTTACAA ACTTTCTTTT CATTAAAAGA 660
CAACTCGCAA TTATGTTAAC AGTGTGATTT GTGTTCCTAC ACGGAAGCCC TCAGATTCTA 720
CTCCCCACCC CCGGCATCTC CCCTGAATCC CTCCCCAACT TATT 764

(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 800 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
TGTCCGCTGT GCTCCTGATC CAGCACAGGC GCCCATTGCC TCTCCCAATT GGGCTAAAGG 60
CTTGCCATTG TTCCTGCACA GCTAAGTGCC TGGGTTCATC CTAATCGAGC TGAACACTAG 120
TCACTGGGTT CCACGGTTCT CTTCCATGAC CCATGGCTTC TAATAGAGCT ATAACACTCA 180
CTGCATGGTC CAAGATTCCA TTCCTTGGAA TCCGTGAGAC CAAGAACCCC AGGTCAGAGA 240
ACACAAGGCT TGCCACCATG TTGGAAGCAG CCCACCACCA TTTTGGAAGC GGCCCGCCAC 300
TATCTTGGGA GCTCTGGGAG CAAGGACCCC CAGGTAACAA TTTGGTGACC ACGAAGGGAC 360
CTGAATCCGC AACCATGAAG GGATCTCCAA AGCAATTGGA AATGTTCCTC CCAAGGCAAA 420
AATGCCCCTA AGATGTATTC TGGAGAATTG GGACCAATCT GACCCTCAGA CAGTAAGAAA 480
AAAAATGACT TATATTCTTC TGCAGTACCG CCTGGCCACG GATATCCTCT TCAAGGGGGA 540
GAAACCTGGC CTCCTGAGGG AAGTATAAAT TATAACACCA TCTTACAGCT AGACCTGTTT 600
TGTAGAAAAG GAGGCAAATG GAGTGAAGTG CCATATTTAC AT^CTTTCTT TTCATTAAAA 660
GACAACTCGC AATTATGTAA ACAGTGTGAT TTGTGTCCTA CAGGAAGCCC TCAGATCTAC 720
CTCCCTACCC CGGCATCTCC CTGACTCCTT CCCCAACTAA TAAGGACCCA CTTCAGCCCA 780
AACAGTCCAA AAGGACATAG 800

(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:
consensus (41/68-1 + 42/68-1 + cl43 68-1)

(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438

(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:
GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAGTAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAGTACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438



(2) INFORMATION FOR SEQ ID NO: 172:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 438 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:
GACTTGAGCC AGTCYTCATA CCTGGACAYT CTTGTCCTTC GGTACATGGA TGATTTACTT 60
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT 120
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA 180
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT 240
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 300
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAAATAG CCAGACCATT AAATACACGA 360
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACATCTGA AGCAGAAGTG 420
GCTTTCCAGG CCCTAAAG 438



(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50



BNSDOCID:<WO 9823755A1> 



wo 98/23755 



PCT/IB97/01482 




KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100 
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK 146 
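
The peptide of SEQ ID NO: 173 has 146 residues, exactly one third of the 438 bases of SEQ ID NO: 170, which suggests that it is the translation of that sequence in the first reading frame. The listing does not state this relationship explicitly, so the following Python sketch is only a consistency check under the assumption that the standard genetic code applies. The names `translate`, `SEQ_170` and `SEQ_173` are illustrative and not part of the listing.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (the usual NCBI table 1 layout).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; unknown codons become 'X'."""
    dna = "".join(dna.split()).upper()      # drop the 10-base block spacing
    usable = len(dna) - len(dna) % 3        # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# SEQ ID NO: 170 as printed in the listing (438 bases).
SEQ_170 = """
GACTTGAGCC AGTCCTCATA CCTGGACACT CTTGTCCTTC GGTACATGGA TGATTTACTT
TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC AAGCACTCTT AAATTTCCTT
GCTACCTGTG GCTACAAGGT TTCCAAACCA AAGGCTCAGC TCTGCTCACA GCAGGTTAAA
TACTTAGGGC TAAAATTATC CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT
ATACTGGGTT ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAGATAG CCAGACCATT AAATACACGA
ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG
GCTTTCCAGG CCCTAAAG
"""

# SEQ ID NO: 173 as printed in the listing (146 residues).
SEQ_173 = "".join("""
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTPEAEV AFQALK
""".split())

if __name__ == "__main__":
    print(translate(SEQ_170) == SEQ_173)
```

The same function can be applied to SEQ ID NO: 171 and SEQ ID NO: 172; codons containing an ambiguity code such as Y are reported as `X` by this sketch rather than being resolved.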



(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:
DLSQSSYLDT LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKVARPLNTR IKETQKASTH LVRWTPEAEV AFQALK 146


(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 146 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:
DLSQSSYLDX LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP 50
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 100
GFCQIWIPRY SKIARPLNTR IKETQKANTH LVRWTSEAEV AFQALK 146



(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
consensus (1/46-7+8/46-7+cl 5/46/7)

(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATGGGGA TGACTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGCGCTCTT AAATTTCCTT 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAC CTCTGCTCAC ACCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCTCATC CCATAACCCT AAAGCAACTA AGAGGGTTCC TTGGCATATC AGCCTTCTGC 300
CGAATATGGA TTCCCGGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AATTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCTTC AGTATAGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAT CAAGCCACCC AAGTGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAGGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCGC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGAT 240
TATCCTCATC CCAAAACCAT AAAGCAACTA AGAGGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCCGATA CAGTGAAATA GCCAGGCCAT TATGTACATT AGTTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AGACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429




(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 429 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:
GACTTGAGCC AGTCCTCATA CCTGGACATT CTTGTTCCTC AGTATGGGGA TGATTTAATT 60
ATAGCCACCC ATTCAGAAAC CTTGTGGCAC CAAGCCACCC AAGCGCTCTT AAATTTCCTC 120
GCTACCTGTG GCTCCAAACA AAAGGCTCAG CTCTGCTCAC AGCAGGTTAA ATACTTAGGG 180
CTAAAATTAT CCAAAGTCAC CAGGGCCCTC AGAGAGGAAC GTATCCAGCG TATACTGGCT 240
TATCCCCATC CCAAAACCCT AAAGCAACTA AGARGGTTCC TTGGCATAAC AGCCTTCTGC 300
CGAATATGGA TTCCCAGATA CAGCGAAATA GCCAGGCCAT TATGTACATT ATCTAAGGAA 360
ACTCAGAAAG CCAATACCCA TATAGTAAGA TGGACACCTG AAACAGAAGT GGCTTTCCAG 420
GCCCTAAAG 429

(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:
DLSQSSYLDI LVLQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAH 50
LCSHQVKYLG LKLSKVTRAL REERIQRILA YPHPITLKQL RGFLGISAFC 100
RIWIPGYSEI ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK 143




(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:
DLSQSSYLDX LVLQYRDDLI IATHSETLWH QATQVLLNFL ATCGSKQRAQ 50
LCSQQVKYLG LKLSKVARAL REERIQRILD YPHPKTIKQL RGFLGITAFC 100
RIWIPRYSEI ARPLCTLVKE TQKANTHIVR WTPETEVAFQ ALK 143



(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 143 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182:
DLSQSSYLDI LVPQYGDDLI IATHSETLWH QATQALLNFL ATCGSKQKAQ 50
LCSQQVKYLG LKLSKVTRAL REERIQRILA YPHPKTLKQL RXFLGITAFC 100
RIWIPRYSEI ARPLCTLSKE TQKANTHIVR WTPETEVAFQ ALK 143

(2) INFORMATION FOR SEQ ID NO: 183:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:
GGCCAGGCAT CAGCCCAAGA CTTGA 25

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:
TGCAAGCTCA TCCCTSRGAC CT 22

(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:
GACTTGAGCC AGTCCTCATA CCT 23
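
SEQ ID NO: 184 above contains the letters S and R, which under the IUPAC nucleotide nomenclature stand for C or G and for A or G respectively, so the entry describes a small family of oligonucleotides and not a single one. The Python sketch below enumerates that family. It assumes the IUPAC reading of the two letters; the function name `expand_degenerate` is hypothetical and the listing itself does not enumerate the variants.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq: str) -> list[str]:
    """Return every unambiguous sequence matched by a degenerate one."""
    seq = "".join(seq.split()).upper()          # drop block spacing
    choices = [IUPAC[base] for base in seq]     # KeyError on unknown letters
    return ["".join(combo) for combo in product(*choices)]


SEQ_184 = "TGCAAGCTCA TCCCTSRGAC CT"

if __name__ == "__main__":
    for variant in expand_degenerate(SEQ_184):
        print(variant)
```

With one two-fold position for S and one for R, the expansion yields four 22-base sequences.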



(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleotide
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:
CTTTAGGGCC TGGAAAGCCA CT 22
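
Read against the longer entries of the listing, SEQ ID NO: 185 coincides with the first 23 bases of SEQ ID NO: 170, and the reverse complement of SEQ ID NO: 186 coincides with its last 22 bases. The listing does not describe the two oligonucleotides in these terms, so the Python sketch below is offered only as a check of that observation. The helper names are illustrative, and only the two ends of SEQ ID NO: 170 (positions 1 to 30 and 361 to 438) are reproduced.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove block spacing and normalise to upper case."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


SEQ_185 = "GACTTGAGCC AGTCCTCATA CCT"
SEQ_186 = "CTTTAGGGCC TGGAAAGCCA CT"

# Ends of SEQ ID NO: 170 as printed in the listing.
SEQ_170_START = "GACTTGAGCC AGTCCTCATA CCTGGACACT"             # positions 1-30
SEQ_170_END = (
    "ATTAAGGAAA CTCAAAAAGC CAATACCCAT TTAGTAAGAT GGACACCTGA AGCAGAAGTG "
    "GCTTTCCAGG CCCTAAAG"                                       # positions 361-438
)

forward_matches = clean(SEQ_170_START).startswith(clean(SEQ_185))
reverse_matches = clean(SEQ_170_END).endswith(reverse_complement(SEQ_186))

if __name__ == "__main__":
    print(forward_matches, reverse_matches)
```

Both comparisons are exact string matches; a check that tolerates mismatches would need an alignment step that this sketch does not attempt.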

CLAIMS

1. Nucleic material, in the isolated or
purified state, comprising a nucleotide sequence selected
from the group including sequences SEQ ID NO: 93, SEQ ID
NO: 94, their complementary sequences and their equivalent
sequences, in particular nucleotide sequences displaying,
for any succession of 100 contiguous monomers, at least
50% and preferably at least 60% homology with said
sequence SEQ ID NO: 93, SEQ ID NO: 94 and their
complementary sequences, excluding HSERV-9 sequence.

2. Nucleic material of claim 1, the nucleotide
sequence of which is selected from the group including
sequences SEQ ID NO: 93, SEQ ID NO: 94, their complementary
sequences and their equivalent sequences, in particular
nucleotide sequences displaying, for any succession of 100
contiguous monomers, at least 70% and preferably at least
80% homology with said sequence SEQ ID NO: 93, SEQ ID NO: 94
and their complementary sequences.

3. Nucleic material, in the isolated or
purified state, coding for any polypeptide displaying, for
any contiguous succession of at least 30 amino acids, at
least 50%, preferably at least 60%, and most preferably
at least 70% homology with a peptide sequence encoded by
any nucleotide sequence selected from the group including
SEQ ID NO: 93, SEQ ID NO: 94 and their complementary
sequences.

4. Nucleic material, in the isolated or
purified state, of retroviral type, comprising a
nucleotide sequence identical or equivalent to at least
part of the pol gene of an isolated retrovirus associated
with multiple sclerosis or rheumatoid arthritis.

5. Nucleic material as claimed in claim 4,
said nucleotide sequence being 80% homologous to said at
least part of the pol gene.

6. Nucleic material comprising a nucleotide
sequence identical or equivalent to at least part of the
pol gene of an isolated virus encoding a reverse
transcriptase comprising an enzymatic site comprised
between the amino acid domains LPQG and YXDD, said virus
having a phylogenic distance with HSERV-9 of 0.063 ± 0.1,
and preferably 0.063 ± 0.05.

7. Nucleotide fragment comprising a nucleotide
sequence selected from the group including SEQ ID NO: 93,
SEQ ID NO: 94, their complementary sequences and their
equivalent sequences, in particular nucleotide sequences
displaying, for any succession of 100 contiguous monomers,
at least 50% and preferably at least 60% homology with
said sequences and their complementary sequences, said
group excluding SEQ ID NO: 1, and said nucleotide fragment
not comprising nor consisting of the sequence HSERV-9.

8. Nucleotide fragment of claim 7, the nucleotide
sequence of which is selected from the group including SEQ
ID NO: 93, SEQ ID NO: 94, their complementary sequences and
their equivalent sequences, in particular nucleotide
sequences displaying, for any succession of 100 contiguous
monomers, at least 70% and preferably at least 80%
homology with said sequences and their complementary
sequences.

9. Nucleotide fragment comprising a coding
nucleotide sequence which is at least partially identical
to a nucleotide sequence selected from the group
including:

SEQ ID NO: 93, SEQ ID NO: 94; their complementary
sequences; their equivalent sequences, in particular
homologous to SEQ ID NO: 93, SEQ ID NO: 94;

sequences encoding at least part of the peptide
sequence defined by SEQ ID NO: 95;

sequences encoding at least part of a peptide
sequence equivalent, in particular homologous to SEQ ID
NO: 95, which is capable of being recognized by sera of
patients infected with the MSRV-1 virus, or in whom the
MSRV-1 virus has been reactivated.

10. Nucleic acid probe for the detection of a
virus associated with multiple sclerosis or rheumatoid
arthritis, characterized in that it is capable of
hybridizing specifically with any fragment according to
any one of claims 7 to 9.

11. Probe as claimed in claim 10, consisting of
between 10 and 1,000 monomers.

12. Primer for the amplification by
polymerization of an RNA or a DNA of a viral material
associated with multiple sclerosis or rheumatoid
arthritis, comprising a nucleotide sequence identical or
equivalent to at least one portion of the nucleotide
sequence of a fragment as claimed in any one of claims 7
to 9, in particular a nucleotide sequence displaying, for
any succession of at least 10 contiguous monomers,
preferably 15 contiguous monomers, more preferably 18
contiguous monomers and most preferably 20 contiguous
monomers, at least 70% homology with at least the said
portion of the said fragment.

13. Primer as claimed in claim 12, comprising a
sequence selected from the group consisting of SEQ ID NO:
99 to SEQ ID NO: 111.

14. Polypeptide encoded by any open reading
frame belonging to a nucleotide fragment as claimed in any
one of claims 7 to 9.

15. Polypeptide of claim 14, characterized in
that the open reading frame encoding it is comprised, in
the 5'-3' direction, between nucleotide 18 and nucleotide
2304 of SEQ ID NO: 93.

16. Polypeptide according to claim 15,
comprising a peptide sequence at least partially identical
to SEQ ID NO: 95.

17. Polypeptide, comprising a peptide sequence
at least partially identical to SEQ ID NO: 96.

18. Polypeptide of claim 17 exhibiting an
enzymatic activity consisting of proteolytic activity.

19. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 18 and ends at nucleotide 340 of SEQ ID
NO: 93.

20. Polypeptide exhibiting an inhibitory
activity on the proteolytic activity of the polypeptide of
claim 18.

21. Polypeptide, comprising a peptide sequence
identical or equivalent to SEQ ID NO: 97.

22. Polypeptide of claim 21, comprising a
peptide sequence identical or equivalent to SEQ ID NO: 98.

23. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 341 and ends at nucleotide 2304 of SEQ ID
NO: 93.

24. Polypeptide, characterized in that the open
reading frame encoding it begins, in the 5'-3' direction,
at nucleotide 1858 and ends at nucleotide 2304 of SEQ ID
NO: 93.

25. Polypeptide of claim 21 or 23, exhibiting a
reverse transcriptase activity.

26. Polypeptide of claim 22 or 24, exhibiting a
ribonuclease H activity.

27. Polypeptide exhibiting an inhibitory
activity on the reverse transcriptase activity of the
polypeptide of claim 25.

28. Polypeptide having an inhibitory activity
on the ribonuclease H activity of the polypeptide of claim 26.

29. Antigenic polypeptide recognized by the
sera of patients infected with the MSRV-1 virus, and/or in
whom the MSRV-1 virus has been reactivated, characterized
in that its peptide sequence is at least partially
identical or is equivalent to a sequence selected from the
group consisting of SEQ ID NO: 95, and fragments thereof,
in particular SEQ ID NO: 96, SEQ ID NO: 97 and SEQ ID NO:
98.

30. Mono- or polyclonal antibody directed
against the MSRV-1 virus, characterized in that it is
obtained by the immunological reaction of a human or
animal body or cells to an immunogenic agent consisting of
an antigenic polypeptide of claim 29.

31. Reagent for detection of the MSRV-1
virus, or of an exposure to the said virus, characterized
in that it comprises at least one reactive substance
selected from the group consisting of a probe as claimed
in claim 10 or 11; a polypeptide as claimed in any one of
claims 14 to 29; or an antibody as claimed in claim 30.

32. Diagnostic, prophylactic or therapeutic
composition, in particular for inhibiting the expression
of a virus associated with multiple sclerosis or
rheumatoid arthritis, and/or the enzymatic activity of the
proteins of said virus, said composition comprising a
nucleotide fragment of any one of claims 7 to 9.

33. Diagnostic, prophylactic or therapeutic
composition comprising a polypeptide of any one of claims
14 to 29, or an antibody of claim 30.

34. Process for detecting a virus associated
with multiple sclerosis or rheumatoid arthritis, in a
biological sample, characterized in that an RNA and/or a
DNA presumed to belong to or to originate from said virus, or
their complementary RNA and/or DNA, is/are brought into
contact with a nucleotide fragment according to any one of
claims 7 to 9.

35. Process for detecting the presence of or
exposure to a virus associated with multiple sclerosis or
rheumatoid arthritis, in a biological sample, wherein said
sample is brought into contact with a polypeptide
according to any one of claims 14 to 29, or an antibody of
claim 30.
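
Several of the claims above set a threshold of the form "for any succession of 100 contiguous monomers, at least 50% homology". The claims do not say how homology is to be computed. The Python sketch below shows one possible reading, in which homology is taken as the percentage of identical positions between two sequences that are already aligned, of equal length and without gaps, and the lowest value over all windows is reported. That reading, the function name `min_window_identity` and its parameters are assumptions made for illustration only.

```python
def min_window_identity(ref: str, query: str, window: int = 100) -> float:
    """Lowest percent identity over every run of `window` contiguous positions.

    Assumes `ref` and `query` are pre-aligned, gap-free and of equal length.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or len(ref) < window:
        raise ValueError("sequences must be at least one window long")

    # True where the two sequences agree at the same position.
    matches = [a == b for a, b in zip(ref.upper(), query.upper())]

    # Sliding-window count of matches, updated in O(1) per step.
    current = sum(matches[:window])
    lowest = current
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        lowest = min(lowest, current)

    return 100.0 * lowest / window


def meets_threshold(ref: str, query: str, percent: float, window: int = 100) -> bool:
    """True if every window reaches at least `percent` identity."""
    return min_window_identity(ref, query, window) >= percent
```

Under this reading a query satisfies an "at least 50%" condition when `meets_threshold(ref, query, 50.0)` is true. A practical comparison would first align the sequences and decide how to score gaps, which this sketch deliberately leaves out.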






FIG. 1

[Alignment of consensus sequences; the legible labels include SEQ ID NO 3 and POL MSRV-1B. The sequence rows are not recoverable from the scanned figure.]






FIG. 2

[Consensus nucleotide sequences, each shown with its translation in three reading frames; the legible labels include CONSENSUS A (SEQ ID NO 3) and CONSENSUS D (SEQ ID NO 6). The sequence rows are not recoverable from the scanned figure.]
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Fig. 6 

CAAGCCACXX: AAGAACICTT AAATTTCCIX:- ACT^^rTCTC GTEACAAOGT 50 

TICCAAACCA AA03CTCAGC TZTGCICACA GGAG^TTAGA TACTIAGGGT 100 

TAAAATTATC CAAAGGCACX: AGGGQOCTCA GTGA3GAAQG TATOCAQZCT 150 

ATACIGGGTT ATCCTCATCC CAAAACOCTA AAOZAACTAA GAGGGTIDCrr 200 

TAGCA1GA1C AGGTnCTOC OGAAAACAAG ATTOGCAGGT ACAAQCAAAA 250 

TAGCCAGAOC ATmTAmZA CmATEAAG:; AAACTCAGAA AGOCAATMC 300 

mrriAGiAA gatogacacx: tt^aacagaag gcttiocagg ooct?w^gaa 350 

G90CCTAACC CAAGDOOCAG TSTXCAGCTT GCXIAACAGGG CAAGATTnT 400 

CITmTAIG3 CACAGAAAAA ACAGGAATGG CiClAGGAGT CCTIACACAG 450 

GTCC53AG3GA TGAGCTIQCA ADCXDGTOSIiA -mXTGAATA AGGAAATIGA 500 

TCXAGTOGCA AAGQGnGQC CTCATiaGTIT ATGQGTAA1G aa39ZAG37^ 550 

CAGICTN?CT A'ICIGAAGGA GITAAAATAA TACA39GAAG AGAlCnNIT 600 

GTCIGGACAT CTCAIGATGT GAAa33CA'IA CTCACTOCTA AAGGAG?CTT 650 

GTOGTIGTCA GACAAOCATT TOCTIJ^ANIA 'ICAQ3CICTA TIACITCAAG 700 

AGOCAGTOCT G^^3ACTGCX33 ACriUlGCAA CICirAAAO: C 741 

S£^2 (PSJ 17) 
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me 



TCAGGGATAGCCCCCATCTATTTGGCCAGGCATTAGCGCAAGACTTGAGTC 

AATTCTCATACCTGGACACTCTTGTCCTTCAGTACATGGATGATTTACTIT 

TAGTCGCCCGTTCAGAAACCTTGTGCCATCAAGCCACCCAAGAACTCTTAA 

CTTTCCrCACTACCTGTGGCTACAAGGTTTCCAAACCAAAGGCTCGGCTCT 

GO-CACAGGAGATTAGATACTNAGGGCTAAAATTATCCAAAGGCACCAGG 

GCCCTCAGTGAGGAACGTATCCAGCCTATACTGGCTTATCCrCATCCCAAA 

ACCCTAAAGCAACTAAGAGGG-nrCCTTGGCATAACAGGTTTCTGCCGAAA 

ACAGATTCCCAGGTACASCCCAATAGCCAGACCATTATATACACTAATTA 

NGGAAACTCAGAAAGCCAATACCTAnTAGTAAGATGGACACCTACAGAA 

GTGGCTTTCCAGGCCCTAAAGAAGGCCCrAACCCAAGCCCCAGTGTTCAGC 

TTGCCAACAGGGCAAGATTTTTCrn-ATATGCCACAGAAAAAAC^^ 

AGCTCTAGGAGTCCTTACGCAGGTCTCAGGGATGAGCTTGCAACCCGTGGT 

ATACCTGAGTAAGGAAATTGATGTAGTGGCAAAGGGTT 



SEQ ID NO 3 (M003-P00^) 



FIG. 7

FIG. 13

SEQ ID NO 46 (FBd3) 
GTGCTGATTGGTGTATTTACAATCCTTTATCTAATCCGAAATGCCCATGTTG 

?Iatatggaaagaaagggag 

5acJacaagtaaatcatggagttattgcacacagtgcaaaaactcaagga 

GGTGGAAGTCnTACACTGCCAAAGCCATCAGAAAAGGGAA 

caSagcataagtggctacagaggcaaggaaagactagcagaaaggaaa 

GAGAGAAAGAGA^^^ 

ca^gagggagtcagagagagagagagacagagagtcagagagaaggaa 

ag^Sggaagagacaaagaatgaatcaaa^^^^^^^^^^ 
cagagagagagagagagaggaagagacagagaaaaagagggagtcagaa 

aSagaccaaagaagaagtccaaag/^aaga^^^^^^^ 

tagtaaaggaaaaacagtgtaccctattcctitaaaagc^^^ 

aaaacctataattgataactgaaggtcitctctgtaacc^^^^ 

AATACCACCrrGTTGTCAAGTGTAAACAAGGGCGTAGCCCAAAAGCACTG 

aggcSctS^caacccatagccttcctatcaa^^^^ 

TrrrCTAACAGGGGATCTAAATCTTAATTAATrACCATACAATGGTCCAAC 
S^AOTAGGA^^^^ 

gg^Sagggagaaagacacaatgggtattcagt^^^ 

ArArTTGTAGAAGCAAAGTrAGGAAAATTGCCAAATAATTGGTTTGCTCAA 

gS???gcact^gccaaaccttgaagtacitgca^^ 

GCCATCrATACCAATTCTAAGTTAATATGGACTQAAGGA^^ 
ACCAAAGAGAAATTAAAATCCCAAACrTATAAGGTrrrCAACCAAAGTAA 

A^^'CT/S^GTrAACAGCGTAACATGTA™ 

rrCAAAGGATITCrCAGACAGTITGCAAGAAATAATGATATCTATCCTTAC 

S^tcSaS^agactcittggcagcagtgactct^ 

ggaagagtgttgtctttacactaaccagtcagggatag^^^ 

ccggcatitacagaaaaaggcttctgaaatc/^ca^^ 

ctataccaacctctggagttgggcaacatggtxtotc^^ 

ATGGCTGCCATCTTGCTATTACTCGCCTTTGGGCCCTG-TATm 
TTGTCAAATXrGTITCTrCTAGGATCGAGGCCATC/^GCTA^^^ 
ACAAATGGAACCCCAAATGAGCTCAACTATCAACTrCTACTGAGGACCCCT 

aSac^Iac^ccctggccc^^ 

CACTACCACTGCAGGGCCCCATCTTTGCCCCTATCCAGAAGGAAGTAGCTA 

GSTc5^?ScCCAATrCCCAAGAGCAGCrGGG^^^^ 

GGGGATTGAGAGGTGAAGCCAGCTGGACTTCTGGG^^^^^^^ 
GAGAACmTGTGTCTAGCTAAAGGATTGTAAATGCAACAATCAGTGCTCT 

GTGTCTAGCTAAAGGATTGTAAATACACCAATCAGCAC 
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SEQ ID NO 51 ( tpol) 



GGCTGCTAAAGGAGACTTGTGGTrGTCAGACAATCGCCTACTTAGGTACCA 

GG^^^^ACITGAGGGACTGGTGCTTCAGATGCGCACI^^ 

TAACCCAAAGTrATGCTGCCCAGAAGGATCTITTA^ 

ACCCTGACCrCAACCTATATATATACTGATGGAAGITCGTTTGTAGAAAAG 

GGATTACAAAGGGNAGGATATNCCATAGGTTAGTGATAAAGCAGTACTTG 

AAACTAAGCCTCrrCCCCCCAGGGACCAGCGCCCCCGTrAGCAGAACTAGT 

GGCACTGACCCCGAGCCTTAGAACTTGGAAAGGGAGGAGGATAAATGTGT 

ATACAGATAGCAAGTATGCTTATCTAATCCGAAATGCCCATGTTG 
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m3 



SEQ ID NO 52 (JLBcl) 
TCAGGGATAGCCCCCATCTATTTGGTCAGGCACTGGCCCA^^^^^^ 
rATGCCACTTTTAAGAGCCATTTCTCAAGTCCAGGTACTCTGGTCCiTCGGT 
CAIOt-i-A^^i „_^.^^.^^^A^TAr^r-("xrATGCCAGCAGG 



rn-A(^rTAGATCTCrTGAACTTrCTAGCTAATCAAGGGTACAAGGCATCTA 

^g^S^Iggcc?:^^ 

??G?^?5^S?ATGCT^^^ 

?A?^rTCTrAGAGTACCCTrAGCrAATCCTGACCTTAACCT^^ 

atcgaISS^^g^Sgaaa^^^ 
^^agtcSStaatcataotgcaagt^^ 

J^rA??^ISAGAACrAGTCACACrTACOT^ 

^IaI^gt^S^vaat^ 

a??ccca??aaataccacaaggyaaatca^^ 

aaaaactS^agg^^^^^ 

?aa^gSaggggaga^^ 

?acSg2^Sagagaaggagagagac^^ 

?Ar?ASSAGACAGAGAGGAAGAGACAGAGAGACAGT^^ 

agacagagagagS^gag 
gSgagaccaaggagtccnaga^^^ 

I1Ic^S:1ac^gata^ 

^r^rr^TACCCCCITGTmAGTGTGAACG^^^ 
A?r?ArACCTCTGAS^^ 



aaaS^a?CTAGG^^^ 
?rrAGG?Gl?^AAASAAAATAAAAAGACACATGGGCAGCCAGTAA 

aacggaac^Sactag^ 
?AA??IIl^AAA?rrfACT^^ 

^^aSacaSctctcanaggattcctcagacagtitacaagaaat^^^ 

STCTA?CTGCT^GGATAGTAACrACAATCCCAAATACATrC^^ 



CAGTGACTCTC 

FIG. 16

SEQ ID NO 53 (JLBc2)
TCAGGGATAGCCCCC.vTCTAl-TTGATCAGGCACTAGCCCAAGATCTAGGCC 
ACTTrTGAAGTCCAGGCATTCTAGTCC-n-CAGTATGTGGATGATTTACTrTT 
GGCTACCAGTrrGGAAGCCTCATGCCAGCAGGCTACTTGAGATCTCTTGAA 
Cl-rrcrAGCTAATCAAGGGTGTATGGCATCTAAATTGAAAGTCCAGCTCTG 
CCTACAACAAGTCAAATATCTAGGCCTAATCTTAGATAGAAGAACCAGGG 
CCCTCAGCAAGGAATGAATAAAGCCTATGCTGGCTTATCGGCACCCTAAGA 
CATTAAAACAATTGTGGGGGTTCCTTGGAATCACTGGCTTTTGCCGACTAT 
GGATCCCTGGATAGAGTGAGATAGCCAGGCCCCCTCrATTACTCTTATCAA 
GGAGACCCAGAGGGCAAATACTTATCTAGTATTATGGGNACCAGAGGCAG 
AAAAAGCCTrCCAAACCTTAAAGGAGACCCl-AGTACAAGCTCCAGCTTTAA 
GCCTTCCCACAGGACAAANCTTCTCTTTATATGTCACAGAGAGAGCAGGAA 
TAGCrCCrGGAGTCCTTACrCAGACTTTTGGACGACCCCACGGCCAGTGGC 
RTACCTAAGTAAGGAAATTGATGTAGTAGCAAAAGGCTGGCCTCACTGTTT 
ATGGGTAGTTGCGGCrGTGGCAGTCTTACTGTCAAAGGCTATCAAAATAAT 
ACAAGGAAAGGAnTCACTATCTGGACTACTCATGAGGAAAATGGCATATT 
AGGTGCCAAAGGAAGTrrTTGGCTATCAGACAACCACCTGCTCAGATTCCA 
GGCACTACTGATTGAGAGACCAGTGCnTAAATATGTATGTGTGTGTGTGG 
rCCTCAACCCTGCCACTGTTCTCCCAGAAGATGGAGAACCAATGAAGCATr 
ACTGTCAACAAA1TAGAGTCCAGAGTTATGCTGCCTGAGAGGATCTCTTAG 
AAGTCCCCITAGCrAATCCTGACCTTAACCTATATGCTGATGGAAGTTCAC 
TTGTGGAGAATGGGATACGAAAAGCACATTATGCCATAGTrAGTGAGGTA 
ACAGTACrrGAAAGTAAGCCTATTCCCCCATGGACCAGAGCCCAGTTAGCA 
GAACrAGTGGCACTTACCCAAGCCTTAGAACTAGGAAAGGGAAAAATAAT 
AAATGTGTATACAGATAGCAAGTATGCTTATCTAATCCTACATGCCCATGC 
TGCAGTATGGAAAGAAAGGGAGTTCCTAACCTCTGGGGGAACCCCCATO 
AATACCACAAGGCAAATCATGGAGTTATTGCATGTAGTGCAAAACCTCAA 
GTAGGTGGCAGmTACACTGCCTGAAGCTATGGGGAAGGAGAGAGGAGA 
ACAGCAGCATAAGTGGCTAGCAGAGGCAGCGAAAGACTAGCAGAGAGGA 
GAGGTAGGGGAAAGACAGAAAGTCAAAGAAAAGAAGTCAAAGACAGACA 

GAGAAAGAGACAGAGGGAGCCAGAGAGAAAG/u*^A^^^ 

GACAGAATGTCAAAGAACAGAAGAGAGAGGCAGCGCCAG/UVGA 

AAAG^GAGAAAGAGAGATGGAAATAGTAAAGAAAAAACAGT^CTACCCT^^ 

TCOTAAAAGCCAGGGTAAATITAAAACGTATAATITTATAATrC^^ 

TCirCTCCATAACCCrATAACATTAAAATACCACCTTGTTGTCAGTGIi^^ 

AAGAGCATAGCCCAAAAGCACTGAGGCCACTGACAACCCATAGCCTTCCT 

ATCAAA^ATCCITAACTCTGCAGGTITCCTAACAGGGGATCTAAATCTCAA 

CTAATCACCATACAATGGTCCGACCAGACCTAGGAGCGACTCCCCrCAGG 

ACAGAAGGATGGATGGTTCCTCCCAGGCCATTAAGGGAAAGAGACACAAT 

GGG?A^?CAGTAAGrGATAAGGGAACrcnTGTAGA^^^ 

GCCTAATATTTGGTCTGCTCAAATGTGCCAGCTGTTrGCACTCAGCTAAAC 

CrrAAATTACrrACAGAATrAGGAAGGAGCCATCTATACCAATTCTGAGTT 

SJlTCAGCTSuVCAAGTrCITATTAATAGCAAAGAATC 

AAOTG^AAAGTTrrCAACAAAAGTAAAGTTTGCTGAAAGTTAGCAGT^^^^ 

AC^TCTATTATCCTAACnTCTAATCTrGTGGAAATCAGACCCTAT^^ 

CCCTCAAAGCTGAAGTCCATCAGCATATGGCCATACAACTAATACCCCTAT 

5???AGGG?S^GGAATGGCCACTGCrACAGGAATGGG^^ 

CrACTrCATrATCCTATrACCACACACTCTTAAAGGATTTCrCAGACAGnT 

ACAAGAAATAACAAAATCTATCC^ 
GGCAGCAGTGACTCTC 

FIG. 17 
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1 TTCCTGAGTT CTTGCACTAA CCTCAAATGA 

61 GTTTGGCGAT CCCTGGTATC TCAGTCAGGT 

121 ATGATTCCCC ACAGGCCAGC AGGCAGTTCC 

181 KQhKCKl^A GATTGGTGCC GCAGACATTT 

241 AACTAGGAAG ATATGAATTA TTCAATGATG 

3 01 TCCTACTGCC TTTCTGGAGA GACTAAGGGA 

361 CATTGGAGGC TCTGGAAAAG GGAAAAGTTG 

421 CCAGTGTGGT CTACAAGGAC ACTTTAAAAA 

481 CGTCCATGCC CCTTATGTCA AGGGAATCAC 

541 TCCTCTGAGT CAGAAGCCAC TAACCAGATG 

601 CAAGCGCCAG CCCATGCCAT CACCCTCACA 

661 CAGAAGGGTA CTGTCTCCTG GACACTGGCG 

721 ACAACTGTCC TCCAGATCTG TCACTGTCCG 

781 CTTCTCCCAG CCACTAAGTT GTGACTGGGG 

841 ATGCCTGAAA GCCCCACTCT CTTGTTAGGG 

901 TATACATGTG AATATAGGAG AAGGAACAAC 

961 TAATCCTGAA GTCCGGGCAA CAGAAGGACA 

1021 TCAAGTTAAA CTAAAGGATT CCACCTCCTT 

1081 CGAGACCCAA CAAGAACTCC AAAAGATTGT 

1141 ACCAAGCAAT AGCCCTTGCA AGACTCCAAT 



GAGAAGTGCC GCCATAACTG CAACCCAAGA 
CAATGACAGG ATGACAACAG AGGAAAGATA 
CAGTGTAGAC CCTCATTAGG AGACAGAATC" 
GCTAACTTGC GTGCTAGAAG GACTAAGGAA 
TCCACTATAA CACAGGGGAA AGGAAGAAAA 
GGCATTGAGG AAGCATACCA GGCAAGTGGA 
GGAAAAGTAT ATGTCTAATA GGGCTTGCTT 
AGATTGTCCA ATAGAAATAA GCGACCACCT 
TGGAAGGCCC ACTGCCCCAG GGGATGAAGG 
ATCCAGCAGC AGGACTGAGG GTGCCCGGGG 
GAGCCCGAGG TATGCTTGAC CATTGAGGGT 
GGCCTTCTCA GTCTTACTTT CCTGTCCTGG 
AGGGGTCCTA GGACAGCCAG TCACTAGATA 
AACTTTACTC TTCCACATGC TTTTCTAATT 
GAGAGACATT CTAGCAAAAG CAGGGGCCAT 
TGTTTGTTGT CCCCTGCTTG AGGAAGGAAT 
ATATGGACAA GCAAAGAATG CCCGTCCTGT 
TCCCTACCAA AGGGAGTACC CCCTCAGACC 
AAAGGACCTA AAAGCCCAAG GCCTAGTAAA 
TTTAGGAGTA AGGAAACCCA ACGGAC 
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GATGCCTTTTTCTGCATCCCTGTACGTCCTGACTCTCAATTCTTGTTTGCCTTTGAAG 

ATCCTTTGAACCCAACGTCTCAACTCACCTGGACTGTTTTACCCCAAGGGTTCAGGGA 

TAGCCCCATCTATTTGGCCAGGCATTAGCCCAAGATGCCTTTTGCATCCCTGTACGTG 

ACTCTCAATTCTTGTTTGCCTTTGCCTTTGAAGATGCTTTGAACCCAACGTCTCAACT 

CACCTGGACTGTTTTACGCCAAGGGTTCAGGGATAGCCCCCATCTATTTGGC 

CAGGCATTAGCCCAA SEQ ID NO 40



Asp-Ala-Phe-Phe-Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe-
Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn-Pro-Thr-Ser-Gln-Leu-
Thr-Trp-Thr-Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His-
Leu-Phe-Gly-Gln-Ala-Leu-Ala-Gln

SEQ ID NO 39 (POL2B)

FIG. 34

Cys-Ile-Pro-Val-Arg-Pro-Asp-Ser-Gln-Phe-Leu SEQ ID NO 41

Val-Leu-Pro-Gln-Gly-Phe-Arg-Asp-Ser-Pro-His-Leu-Phe-Gly-Gln-Ala-Leu-Ala SEQ ID NO 42



Leu-Phe-Ala-Phe-Glu-Asp-Pro-Leu SEQ ID NO 43 

Phe-Ala-Phe-Glu-Asp-Pro-Leu-Asn SEQ ID NO 44 
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10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

CITCXrCZ^ TAMIAAGGAC CXXXXTTICA I^COOPAKJ^ TCCAAAAQGA 50 
LPQL IRT PL S TQTV QKD 
FPN . .GP PFQ PKQ SKRT 
SPT NKD PPFN PNS PKG 



GATAGACAAA QGAGTAAACA ATGAACXZAAA G?^GIQCCAAT AITCOCIQCT 100 
I D K G V N N EPK SAN IPWL 
. TK E,T MNQR VPI FPG 
HRQR SKQ .TK ECQY SLV 

TAIGGAODCT CCAAGOOGIG GGAGAAGAAT TCGGCXXIAQC CAGAGIQCAT 150 

CTL QAV GEEF GPA RVH 
YAPS KRW EKN SAQP ECM 
MHP PSGG RRI RPS QSAC 

GEACumrr CTCICICACA CTTGAAQCAA ATIAAAATAG JOTIAGGINA 200 
VPFS LSH LKQ IKID XGX 
YLF LSHT .SK LK. T.VN 
TFF SLT LEAN .NR XRX 

ATINICAGAT A300CTGAIG GYTAEATIGA. TLJi'i'i'iACAA QQATEAGGAC 250 
XSD SPDG YID VLQ GLGQ 
XQI ALM XILM FYK D.D 
IXR. P .W LY. CFTR IRT 

AAICCrTIGA TCTGACATQG AGAGATAIAA TATEACIQCr AAAXCAGAOG 300 

S F D L T W R D I I L L L N Q T 
NPLI .HG EI. YYC. IRR 
IL. SDME RYN ITA KSDA 

CIAADCTCAA AIGAGAGAAG TQCTOCTAIA ACTGGAGQGC GAGAGi'i'iUG 350 
LTSN ERS AAI TGAR EFG 
. PQ MREV LP. LEP ESLA 
NLK .EK CCHN WSP RVW 

CAATCrCCIQG TATCTCAGIC AGGICAATCA TAGGAIGACA AOGGAQGAAA 400 
NLW YLSQ VND RMT TEER 
ISG ISV RSMI G. Q RR K 
QSLV SQS GQ. .DDN GG^'K 

GAGAAGGATT CCXDCACAGOG CAGCA^^ 450 

ERF PTG QQAV PSV APH 

E N D S PQG SRQ FPV. LLI 

RTI PHRA AGS SQC SSSL 

TGQGACACAG AAICAGAACA TQGAGATIQG TGQOGCAGAC ATITA 495 
WDTE SEH GDW CRRH L 
GTQ NQNM EIG AAD I 
GHR IRT WRLV PQT F 
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Fl(^ '^^ 

1234567890 1234567890 1234567890 1234567890 1234567890

CTTCCOCAAC TTATAMGAC CQXCri'ICA AOCCAAACAG TCCAAAftG2?iL 50 
LPQL IRT PLS TQTV QKD 

CATCAGACAAA QG^^jTAAACA ATGAACXilAAA G^ySIOXAAT MTOCCIQCT 100 
IDK GVNN EPK SAN IPWL 



TATQCAOOCT OCAAGOQOTG OSAGAAGAAT Ta33GCC?OC CA£S^CT3CAT 150 
CTL QAV GEEF GPA RVH 

Gr?OnTnT CICTCICACA CTTGAAOZAA ATTAAAATAG ACCTAGGTAA 200 
VPFS LSH LKQ IKID LGK 

Aricic;^T ;y30cciGAax^ gytaomtga TOrrmcAA ggateaggac 250 

FSD SPDG YID VLQ GLGQ 

AATOCITIGA TCICACATOG AGAGATATAA TAITACKrT PM^ICU^^ 300 
SFD LTW RDII LLL NQT 

CTAACCrCAA ATGAGAGAAG TOCTOOCASA ACIOG?OOCC GMAGITIG3 350 
LTSN ERS AAI TGAR EFG 



CAATCICr33 miCICAGIC AGJTCAATCA T?03A1G^ A03G?^3GAAA 400 
NLW YLSQ VND RMT TEER 

GAGAADGATT OCXXIACAGGG C^^GCAGQC?^ TTOX^CTGr AQCTOCriCAT 450 
ERF PTG QQAV PSV APH 

T03GACACAG AADXI^GAACA TGG?^3A3TOG TO3D3CAGAC ATmCAACr 500 
WDTE SEH GDW CRRH LQL 

TOOGIQC32AN AAOGACINf^ GAAAACTOQG AAGACIA^GA A2TAriCAAN 550 
ACX KDXG KLG RLX IIQX 

GAlOrcCACT ANt^CACAGS GGAAAGSA;^ AAAAaTTTAC TOCrTTICIG 600 
CPL XHR GKEE NPT AFL 

G^iGAGACEAA Qa3AG3CATr GAGC^s^^^aCAT ACC^^SGCAAG TOGACATIGG 650 
ERLR EAL RKH TRQV DIG 

i^33CICTGGA AAAG33AAAA GTIGGQCAAA TEATAJOOT AATAG3CCTr 700 
GSG KGKS WAN YMP NR-AC 

CJCnUCAGIG CAGICTACAA QGA03CnTIA GAAAAGATIG TIXAAOTAGA 750 
FQC SLQ GRFR KDC PSR 

AAIAAGOOCX: OlXJiOTOJCA TGCXJLVriOT GICAAGGGAA TCACI03AAG 800 
NKPP I>VH APY VKGI TGR 



QCCIACTQDC CX:?G333AOG AMGTIOCTCr GAGICAGAAG OIACIAAOCT 850 
PTA PGDE GPL SQK PLT. 

852 
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AAOGAAACIC iO^AZOXAA. TACXXMT^ 50 
KETQ KAN THL VRWT PEA 
RKL RKPI PI. .DG HQKQ 
GNS ESQ YPFS KMD TRS 

AGAAQCAGCr TICCAGGOCX: TAAflGAAATC OCnAADGCAA GOOQCSGIGT 100 
EAA FQAL KKS LTQ APVL 
KQLSRP.RNP.PKPQC 
RSSF PGP KEI PNPS PSV 

TAAGCnOCC AACGQGGCAA GALTi'l'lUiT TAaATOICAC iO^AAAACSG 150 

SLP TGQ DFSL YVT EKQ 
.ACQ RGK TFL YMSQ KNR 
KLA NGAR LFF ICH RKTG 

GAM1AGC7ICT A33AGraCTr ACACAGGIO: AMGGACAAG CTIGCAACCT 200 
E.L. ESL HRS KGQA CNL 
NSS RSPY TGP RDK LATC 
lAL GVL TQVQ GTS LQP 

GK3GCATADC TGAGTAAQGA AATIGAIGEA. NIGQCAAAGG GnOQQCICA. 250 
WHT .VRK LMX WQR VGLI 
GIP E.G N.CX GKG LAS 
VAYL SKE TDV XAK G WPH 

TIGTTEACftG GEM3GQCaGC AGI30ZfiGrC TEAGITTCIG AAACAGITAA 300 

VYR .GS SSSL SF. NS. 
LFTG RAA VAV LVSE TVK 
CLQVGQQ -QS .FLKQLK 

AATAATACSG GGAAGAGAIC TEACIGIGIG GACATCTCAT GftTCTGAAOG 350 
NNTG KRS YCV DIS. CER 
IIQ GRDL TVW TSH DVNG 
.YR EEI LLCG HLM M.T 

QCATACICAC TGCEAAAGftG GACTIGIQGC IGTCfiGACAA QC2aTIlACIT 400 
HTH C.RG LVA V RQ PFT.; 
ILT AKE DLWL SDN HLL 
AYSL LKR TCG CQTT lYL. 

AAATfiQCAQG TTCTAITACr TGAAGIGGCA GIGC7IOGGAC 1GCACA3TIG 450 

lAG SIT .SAS AAT AHL 
K.QV LLL EVP VLRL HIC 
NSR FYYL KCQ CCD CTFV 

IQGAACICIT AAOXSGCXZA C MTiL ' i ' ilJC ASACAAIGAA GAAAAGAIAG 500 
CNS - PSHISSRQ.RKDR 
ATL NPAT FLP DNE EKIE 
QLL TQP HFFQ TMK KR. 
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r ICjUO AAC?^T?^ACIG TCAACAAGTA ATTQCTrCAAA aTTAIGCra: TOGAQQQGAC 550 
1-^ T.L STSN CSN LCC SRGP 
^ HNC QQV lAQT YAA RGD 
NITV NK. LLK PML'L' EGT 



CnCIAGAaG TiaXTIGAC TGATOCXXiAC CTCAACTIGT AEACTGATOG 600 

SRG SLD .SRP QLV Y.W 
LLEV PLT DPD LNLY TDG 
F.R FP.L IPT STC ILME 

AAGTIOCTIG GGAGAAAAAG GACTTIGAAA A3CX3QQGrAT QCACTGATCA 650 
KFLG RKR TLK SGVC SDQ 
SSL AEKG L.K AGY AVIS 
VPW QKK DFEK RGM Q.S 

GIGAEAATQG AATACTIGAA ACTAATOGQC TCACIOCAQG AACJCAGIGCT 700 
. .W NT.K .SP HSR N.CS 
DNG ILE SNRL TPG TSA 
VIME YLK VIA SLQE LVL 

GACCT3GCAG AACTAAT?^ OTICACTIQG QCACIAGAAr TAGGAGAAGG 750 

PGR TNS PHLG TRI RRR 
HLAE LIA LTW ALEL GEG 
TWQ N. .P SLG H.N .EKE 

AAAAAQQGIA AAEATATATT CAGACIUTAA GIATGCITAC CTAGTIXXTCC 800 
K K G K . Y I F R _,.L . . V C L P . S P . P 
KRV NIYS DSK YAY LVLH 
KG. lYI QT LS MLT .SS 

ATOOXATGC AGCAATATQG AGAGAGAQQG AATTOCTAAC TICIGAGQGA 850 
CPC SNME REG IPN F.GN 
AHA AIW RERE FLT SEG 
MPMQ QYG ERG NS.L LRE 

ACADCEATCA AOCATCAGQG AAGGCATEAG GAGAIITAnA TIGGCTGEAC 900 

TYQ PSG KPLG DY Y WLY 
TPIN HQG SH. EI I I GOT 
HLS TIRE AIR RLL LA VQ 

AGAAAOOTAA A3AGGTOQCA GICTEACACT QQCAQGGKA TCAQGAAGAA 950 
RNLK RWQ SYT ARVI RKK 
ET. RGGS LTL PGS SGRR 
KPK EVA VLHC QGH QEE 

O^GGAAAGQG AAAniAGAAQG CAATOXCAA QGQGAEATIG AAQCAAAAAA 1000 
RKG K.KA lAK RIL KQKK 
GKG NRR QSPS GY. SKK 
EERE lEG NRQ ADIE AKK 



WO 98/23755 PCT/IB97/01482

FIG 38 C

10 20 30 40 50
1234567890 1234567890 1234567890 1234567890 1234567890

AGODQCAAGG CAGGACTCTC CPSTPGAM^ GCTEATAGAA Q3AaXCT?!G 1050 

PQG RTL H.KC L.K DP. 
SRKA GLS IRN AYRR TPS 
AAR QDSP LEM LIE GPLV 

TA.TGG3CTAA TCCXXTCIGG GAAACXIAAGC GCC»GrACIC i^3CAGGAAAA 1100 
YGVI PSG KPS PSTQ QEK 
MG. SPLG NQA PVL SRKN 
WGN PLW ETKP QYS AGK 

AEAGAATSGG AAACCICACA AGGACATACT ITCCTOOOCT GCAGATQGCT 1150 
.NR KPHK DIL SSP PDG. 
R I G N L- T R. . T Y^- F .PPL Q M A 
lE.E TSQ GHT FLPS RWL 

^OXACIGAG GAftGGftA 

P L R K E 
S H . G R 
ATE EG 
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AACTIGCX3IG CI»GA?03AC TAAQGAAAA^ 50 
NLRA RRT KEN .EDY ELF 
TCV LEGL RKT RKT MNYS 
FIG 39 LAC .KD .GKL GRL .II 

Q CAATCATCIC CACTATAACA CAGGGGAAAG GAAGAAAATC CTACTGCETT 100 
NDV HYNT GER KKI LLPF 
MMS TIT QGKG RKS YCL 
Q.CP L.H RGK EENP TAP 

TCIQGAGAGA CTAAQQGAGG CATIGAGGAA QC^OS^^^ 150 

WRD .GR H.GS IPG KWT 
SGET KGG lEE AYQA SGH 
LER LREA LRK HTR QVDI 

TIQGS09CIC TG3AAAAQGG AAAAGTIGQG CAAATIGAAT GOCTAATMG 200 
LEAL EKG KVG QIEC LIG 
■ WRL WKRE KLG KLN A. .G 
GGS GKG KSWA N.M PNR 

Q L 'i' HJ L TiC C AGIQC3«3IUr ACAAGGACX3C TTTAGAAAAG AnOTOCAAG 250 
LAS SAVY KDA LEK IVQV 
LLP VQS TRTL .KR LSK 
ACFQ CSL QGR FRKD CPS 

TAGAAAIAAG CCXSCQOCTOG TOCAIQOGOC TimUiCAAG QGAAaX3^CIG 300 
EIS RPS SMPL MSR ESL 
K.A APR PCP LCQG NHW 
RNK PPLV HAP YVK GITG 

GAAaSOOTAC TOOCOCAGGG GADGAAGOrC CICIGACTCA GAAGOCACTA 350 
EGLL PQG TKV L.VR SH. 
KAY CPRG RRS SES EATN 
RPT APG DEGP LSQ KPL 

ACCIGATCAT CCAGCAGCAG GACIGAGOCT QOOOaQOGCA AGIGOCAQOC 400 
PDD PAAG LRV PGA SASP 
LMI QQQ D.GC PGQ VPA 
T. -S SSR TEG ARGK CQP 

CATQCCAICA CGCTCSGAQC CGOaOGEAIG TTIGADCATr GAGAGGCfiGG 450 

CHH PQS PGYV .PL RAR 
HAIT LRA PGM FDH. EPG 
MPS PSEP RVC LTI ESQE 

AAGITAACIG TCICCIQGAC ACTQaOGCAG CLTiCiCAGT CTTACnTQC 500 
KLTV SWT LAQ PSQS YFP 
S.L SPGH WRS LLS LTFL 
VNC LLD TGAA FSV LLS 
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^\C^ "^O TOTOQCftGAC AAncrroCIC CAGATCIGrC i^MCOGAG GCCTOTiaAG 550 

1 lO VPD NCPP DLS LSE GS.D 

b SQT IVL QICH YPRGPK 

CPRQ LSS RSV TIRO VLR 

ACAQGCAGIC ACTACATACT TCTCTCfiGOC ACIAAGTIUr GftCIGGGGftA 600 

SQS LHT SLSH .VV TGE 
TASH YIL LSA TKL. LGN 
QPV TTYF SQP LSC DWGT 

CTTIACICrr T7ICACAIGCT TTICIAATEA TQCXTIGAAAG aXJCACTCCC 550 
LYSF HML F.L CLKA PLP 
FTL FTCF SNY A.K PHSL 
LLF SHA FLIM PES PTP 

TKHTAGGGA GSGACAinT AGCAAAAQCA G^GGCCATm TftCACXnCAA 700 
C.G ETF. QKQ GPL YT.T 
VRE RHF SKSR GHY TPE 
LLGR DIL AKA GAII HLN 

CATftOGAAAA QGAAIAOOCA TTiaCIGia: CXTOCTIGAG GAAGGAATm 750 

.EK EYP FAVP CLR KEL 
H RKR NTH LLS PA.G RN. 
IGK GIPI CCP LLE EGIN 

imXriGAACT Cnm3C3^ATA GAAGGACAAT ATOGACA?^ 800 
ILKS GQ. KDN MDKQ RMP 
S.S LGNR RTI WTS KECP 
PEV WAI EGQY GQA KNA 

a-XCCIGTIC AfiGITAAACr AAAQGATTCT QCCIGCITIC CXTMXAAAG 850 
VLF KLN. RIL PPF PTKG 
SCS S.T KGFC LliS LPK 
RPVQ VKL KDS ASFP YQR 

GAAGEAOXT CTEAGACGOG AQGQOCTACA AGGACICAAA AGAa'iuriAA 900 

STL LDP RPYK DSK DC. 
EVPS -TR GPT RTQK IVK 
KYP LRPE ALQ GLK RLLR 

QGACCIAAAA GCCX^^AGOCC TAGEAAAADC AIGCAGEAGC OGCIGCAATA 950 
GPKS PRP SKT MQ.P LQY 
DLK AQGL VKP CSS PCNT 
T.K PKA ..NH AVA PAI 

ciccAAnrr aogageaagg aaacqcaadg gacactggag gitagiqcaa looo 

SNF RSKE TQR TVE VSAR 
PIL GVR KPNG QWR LVQ 
LQF. E.G NPT DSGG .CK 
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C GAICICAQGA TEATEAATGA. QGL'iUi'i'i'lT CCTCTAaCAOC CaOCTGfEATC 1050 

SQD Y. . GCFS SIP SCI 
DLRI INE AVF PLYP AVS 
ISG LLMR LFF LYT QLYL 

TAQCQCnAT ACICIGCTIT CCXTEAATCADC AGAQGAAGCA GAGmJi'i'iA 1100 
. PLY SAF PNT RGSR VVY 
SPY TLLS LIP EEA E.FT 
ALI LCF P.YQ RKQ SSL 

CAGTOCTQSA QCITAMGAT GCCICITICT GCATCGCIGT AC2^TCCIGftT 1150 
SPG P. GO LFL HPC TS.F 
VLD LKD ASFC IPV HPD 
QSWT LRM PLS ASLY ILI 

TCicAATicr T x^ri ' iur crr tcaaga2CCt tigaaoqcaa igicicaatt 1200 

SIL VOL .RSF EPN VSI 
SQ FL FVF EDP LNPM SQF 
LNS CLSL KIL .TQ CLNS 

CACCTOSACT GITTEAOGCX: AGGGGTTOOG GGATAGCCXX: CMCEATTIG 1250 
HLDC FTP GVP G.PP SIW 
TWT VLPQ GFR DSP HLFG 
PGL FYP RGSG lAP lYL 

GGCAG3CArr AGQOCAAGAC TIGSGQCAAT TCICATACCT QGACATCTIG 1300 
PGI SPRL EPI LIP GHLV 
QAL AQD LSQF SYL DIL 
ARH. PKT .AN SHTW TSC 

TCCTTOQGEA TOOGAIGATT TAATmSGC CACXXGITCA GAAACXTITCr 1350 

LRY GMI .F.P PVQ KPC 
SFGM G.F NFS HPFR NLV 
PSV WDDL ILA T RS ETLC 

GCCAICAAGC CAGCX^^AQOG TICTTAAA3T TOCTCACTOC GICT3QCEAC 1400 
AIK'P PKR S.I SSLR VAT 
PSS HPSV LKF PHS VWLQ 
HQA TQA F LNF LTP CGY 

AAGGTITCCA AAOCAAAGOC TCAQCTCIGC TCACAGCS^ TIAAATACIT 1450 
RFP NQRL SSA HSR LNT. 
GFQ TKG SALL TAG .IL 
KVSK PKA QLC SQQV KYL 

^GQGITAAAA TEATOCAAAG QCADCAQGQC (XTCIGIGAG GAAaXSEATQC 1500 

G.N Y P K A P G P S V R N V S 
RVKI IQR HQG PL.G MYP 
GLK LSKG TRA LCE ECIQ 
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AAOCTOEACr GOCTIMUrr CATCOCAAAA CXXTAAAGCA ACTAAGAAQG 1550 
NLYW LIF IPK P.SN .EG 
TCT GLSS SQN PKA TKKV 
PVL AYL HPKT LKQ LRR 

TCnriQGCAT AACMGITIC TQCXDGAA 1577 
PWH NRFL P 
LGI TGF CR 
SLA.QVSAE 
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TCX::MCMC?^ QGACIG?V3QG TOCXXX^QQQC AZ^GIGC?CAGC CXMOXATC 50 
SSSR TEG ARG KCQP MPS 

XXXTCAG?^ COOQQQCTAT GITIGACJCAT TGAGAGGCAG GAAdETAACT 100 
PSE PRVC LTI ESQ EVNC 

GTCTOCTOGA CACTQGCX^CA GCCTICICAG TCTEACTTTC CnGTCOCAGA 150 
LLD TGA AFSV LLS CPR 

CAATIGTOCr QCAGMCIGT GACTAIiOaGA QOOGIOCrAA GACMOGAGT 200 
QLSS RSV TIR GVLR QPV 

CACTACAEAC TICIUICAQC CACEAAGITG TGACIQGQGA ACTITACICr 250 
TTY FSQP LSC DWG TLLF 

TITCACAIQC TITICIAATr ATGOCTGAAA QCmiilACrOC CriUi'iAGGG 300 
SHA FLI MPES PTP LLG 

AS?OOmrD TAQCAAA^iGC AGGGQQCAIT ATACAOTIGA ACATAGGAAA 350 
RDIL AKA GAI IH L*N I G K 

A3GAATACaC ATrrOCTGTC COJiUL'i'iGA GGAAGGAATT AATOCTGAZ^G 400 
GIP ICCP LLE EGI NPEV 

TCTOGOCAAT AGAAGGACAA TAIGGACAAG CAAAGAAIGC OOCTCXTIGIT 450 
WAI EGQ YGQA KNA RPV 

CAAGTEAAA:: TAZ^AQGATTC TGCXiTiaTITr (XCIAOCAAA QGA^^GEACOC 500 
QVKL KDS ASF PYQR KYP 

TCTTAGADCX: GAQQCQCIAC AAQGACICAA AAGAi'lUi'iA A3GADCT 547 
LRPEALQGLKRLLRT 
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WAX EGQ YGQA KNA 



CfifiCrnJKANZ TRAWXPtnC TXCraCTTT aC3CTA3aM 
QVKL KDS ASF PYQR 



TCTTflGvai: a^amr^c apggwciea aa«iattcit 

LR? EALQ GXQ KIV 
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AQG LVK PCSS PCN 
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TPT EVAF QAL KKA 
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ATGAICCAQC AQCAQGAOSIG AQOGUGCODG aOQCAAQCXX: CAGCa::ATOC 50 
MIQQ QDX GCP GQAP AHA 



CATCADCCTC ACAGAGOQCC AGCTATOCIT GAOCATIGAG QGrECAGAAGG 100 
ITL TEPQ VCL TIE GQKG 



GINACIGTCT a::TOGACACT QGOGCS^JXCT TXIDCAGICIT ACTTTaCICT 150 
XCL LDT GGAF SVL L SC 



CCTGGACAAC TGIOTrOCAG ATCTGTrCACT GICGGAQOaG TCCTAQGACA 200 
PGQL SSR SVT VRGV LGQ 



GOCAGICACr AGATACTICT CQCAQCXIACr AAGTICTGAC TOQGGAACTT 250 
PVT RYFS QPL SCD WGTL 



mnUITGQC ACMQCmT CEAATTAIGC CIGAAAQOQC CACICICTIG 300 
LFP HAF LIMP ESP TLL 



TTQGGGAGAG ACATICTAGC AAAAXAQQG QOCATIIATAC ATOIGAATAT 350 
LGRD ILA KAG AIIH VNI 



^^GGAGAAQGA ACAACTGITr GITCIOXUr QCTIGAGGAA QGAATEAATC 400 
GEG TTVC GPL LEE GINP 



CIGAAGTQOG GGCAACAGAA QGACAATAIG GACAZ^GCAAA GAAIGOQO?! 450 
EVR ATE GQYG QAK NAR 



TTAAACTAAA QGATTOCACC TOCTTTOOCT AGCAAMGCA 500 
PVQV KLK DST SFPY QRQ 
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GTACOCXJCIC AGACOOGAGA CCICAACAAGA. ACTOCAAAAG ATIGTAAMG 550 
YPL RPET QQE LQK IVKD 



ACCTAAAAQC CT^^AGGOCTA GTAAAAOGAA GCAATAGCCC TTGCAAGACT 600 
LKA QGL VKPS NSP CKT 



QCAATmAG GACTAAGGAA AOCTAAOQGA CAGTOGAGGT TAGIQCAAGA 650 
PILG VRK PNG QWRL VQE 



ACTCAGGATT ATCAAIGAGG CIGTlGi'iCX: TCTATACOCA GCTGTAOCIA 700 
LRI INEA VV P LYP AVPN 



AOQCITATAC AGIGCTITCC CAAATADCAG AQGAAQCAGA GIGGnTACA 750 
PYT VLS QIPE EAE WFT 



GTOCIGGADC TTAAQGATQC C'l'i'i'i'iCIGC ATOQCIGTAC GIOnGACIC 800 
VLDL KDA FFC IPVR PDS 



TCAATICTIG TTTOQCTITG AAGAICCTIT GAACGCAAOG TCTIGAACICA 850 
QFL FAFE DPL NPT SQLT 



OCTGGACTGT TTEACOCXIAA QQGITCAGQG AimOOOCCA TCTATTIGGC 900 
WTV LPQ GFRD SPH LFG 



CAGQCATTAG COCAAGACIT GAGTCAATTC TCATACCTQG ACACTCTIGr 950 
QALA QDL SQF SYLD TLV 



QOTTCAGEAC ATOGATGATT TACTTTIMJr CGC^QOGITCA GAAACCTIGT 1000 
LQY MDDL LLV a'rS ETLC 
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GOCATCAPDC CACXXIAAGAA CICTTAACIT TCXiTCACTAC CIGIO^CTAC 1050 
HQA TQE LLTF LTT C GY 



AAOGrrETOCA. AACCAAAQQC TOOQCrCTCC TC?003AG?^ TTAGATACIN 1100 
KVSK PKA RLC SQEI RYX 



AQQQCrAAAA TEATCGAAAG GGZ^CGAGQQC CXTICZ^GIGAG GAAOGrEATOC 1150 
GLK LSKG TRA LSE ERIQ 



AGCXTIATACr GGCTIAroCT GATOQCAAAA OXTAAAGCA ACTAAGAQGG 1200 
PIL AYP HPKT LKQ LRG 



TICCTIGQGA TAM:AGGnT CTOCX^GAAAA CAGATiaXA QGEACASCXr 1250 
FLGI TGF CRK QIPR YXP 



AATAGOCAGA CCmTATATA CACTAATTAN QGAAACTCAG AAMCTAATA 1300 
lAR PLYT LIX ETQ KANT 



ccmrrvNiT az^tggaca cctacagaag TOOcrrrocA ooaxriAAAG 1350 

YLV RWT PTEV AFQ ALK 



AAGQQOCTAA COCAAQQQCC ACTGTICAGC TIGOiAACAG GQCAAGATIT 1400 
KALT QAP VFS LPTG QDF 



TICrrmTAT GOCACAGAAA AAACAGGAAT AXTCTAGGA GICCTEADQC 1450 
SLY ATEK TGI ALG VLTQ 



AGGICICMG GATGAGCriG CAACOOGIGG TftTACXTCGAG TAAGGAAATT 1500 
VSG MSL QPVV YLS KEI 
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GATEGTAGIGG CAAAGQCTIG GCHICAIISICT TmTOQCTAA TaaM3QCACT 1550 
DVVA KGW PHX LWVM XAV 



GTATUIGAAG CAGITAAAAT AATACAQQGA ^OOVICTIN 1600 
AVX VSEA VKI IQG RDLX 



CICTCTQGAC ATCICATGAT GIGAAa3QCA TACTSRCIGC TAAAQGAGZ^C 1650 
VWT SHD VNGI LXA KGD 



TKJDQGTIGr CAGACAAOCA TITACTTAAN TAYGAGQCYY TATIACTIGA 1700 
LWLS DNH LLX YQAL LLE 



AGAGOCAGIG CTOSIGACTGC QCACTTGnX: AACTCITAAA CCCAAACTEA 1750 
EPV LXLR TCP TLK PKLM 



TOCIGCXXAG AAGGATCirr NEAGAGGIOC OCTEAGQCAA CCX7IGACCTC 1800 
LPR RTF XEVP LAN PDL 



AACTAIEATAT ATACTCATOG AACITCGITr GTAGAAAAGG GATTACAAAG 1850 
NYIY TDG SSF VEKG LQR 



QOSIAaGATAT MXATAQGIG TTAGIGATAA AGCAGTACIT GAAAGTAAGC 1900 
XGY XIGV. SDK AVL ESK P 



CTcrrcrax: ccagggacca ooGOCXDaaGr tagcagaactt agiggcacig 1950 
LP P qgp appl ael val 



ACXXXirX^ OCHTAGAACr TUQGAAAQQG AGGAQGATAA AIGICTATAC 2000 
TPRA LEL WKG RRIN VYT 
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AGATAGCAAG TMQCTIATC TAAICOGAAA IQCXXATCTT GCAATMQGA 2050 
DSK YAYL IRN AHV AIWK 



AAGAAAQQGA GrnCCIAACX: TCIQGGQGAA CmXATTAA ATAOZACAAG 2100 
ERE FLT SGGT PIK YHK 



TTAATCATQG AGITATIQCA CACAGTOCAA AAACICAAQG AGGIQGAAGT 2150 
LIME LLH TVQ KLKE VEV 



CITACACIGC CAAAGCCyflC AGAAAAGOGA AAGAGQQGAA GAGCAGCAIA 2200 
LHC QSHQ KRE RGE EQHK 



AGIQGCJTACA GfiGQCAAQGA AAGACTAGCA GAA^GGAAAG AGAGAAfiGAG 2250 
WLQ RQG KTSR KER EKE 



ACAGAAAGTC AGPCAGAGAG AGAQGAAGAG ACAGAQCACA AAGAGGGAGT 2300 
TESQ RER EEE TEHK EGV 



CAGAGAGAGA GAGAGACAGA GAGTCAGAGA GAAGGAAAGA GAGfiGAQGAA 2350 
RER ERQR VRE KER ERG R 



GAGACAAAGA ATCA 2364 
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I/:SrQVKYLG LKLSK\MlAL REERIQRILfV [YPHJ^lpiJKQL FHFLGlffiFC 
LCSIQVKYLG LKLSKVPRAL REERIQRILD YPHPc<IIKQL R3FTjGIIAPC 
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RIWIEFYSEI 
RIWIP = YSEI 

PTWTtQyQP'T 



ARPLCTLIKE TQKANTHIVR WTPETEVAFQ ALK 
ARPLCTLi^KE TQKANTHIVR WTPETEVAFQ ALK 
ARPLCTL = KE TQKA^7^HIVR WTPETEVAFQ ALK 
appT/^]krT7 ^Tfa^T^m^m y^rrTiwmn^Trn ar.v 
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DLSOSSYLclr 

nT.cr|CCVT. 



LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP
LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP
LVLRYMDDLL LATHSETLCH QATQALLNFL ATCGYKVSKP
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KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT 
KAQLCSQQVK YLGLKLSKGT RTLSEERIQP ILGYPHPKTL KQLTAFLGIT
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3ACTTGAGCC AGTCITCATA 
SACTTGAGCC AGTC i TCATA 
GACTTtSAGCC AG'ixit^^' 



cctggaca:t 

CCTGGACAiT 
lTA CXriGGACAir 



CTTGTCCTTC GGTACATGGA
CTTGTCCTTC GGTACATGGA
CTTGTCCTTC GGTACATGGA



Vtr rTTTTTrrTTT fYTrAPATYr^Ai 



TCAnTACIT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC 
TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTGTGCCAT CAAGCCACCC
TGATTTACTT TTAGCCACCC ATTCAGAAAC CTTCTCCCAT CAAGCCACCC
Tparr>rrrar*T^ ^prRr yv&rw^ aTrTv^anaaar^ ro^rrmryv^aT raaryY^aryv- 



AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA
AAGCACTCTT AAATTTCCTT GCTACCTGTG GCTACAAGGT TTCCAAACCA 
aarir^arTV'^ aan M^i'i^ s "it ry^arv^^Tvyrr: ry^anAaryrp T^TYvaaannA 
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41/68-1 propre 
cl43 propre 68-1 
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AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
AAGGCTCAGC TCTGCTCACA GCAGGTTAAA TACTTAGGGC TAAAATTATC 
aaryy^ajTr Tmy y^aoa ry^anfrm^aaa TAr^Afyyy TAARATTATg 
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CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTI 
CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT 
CAAAGGCACC AGAACCCTCA GTGAGGAACG TATCCAGCCT ATACTGGGTT
r^aaaryjrarv^ anA &rw^A CTnamanrv:: ^TATW^Aryy^ an^ar^TYyyyrn- 



ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA
ATCCTCATCC CAAAACCCTA AAGCAACTAA CAGCGTTCCT TGGCATAACA 
a^TYV^^aaYv^ oaaaannrT>a aanrAAn^paa r^aryvmw^ Tryy^aTOanA 



GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAtSteAG 
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAAWTAG
GGTTTCTGCC AAATATGGAT TCCCAGGTAC AGCAApCpAG



CCAGACCATT 
CCAGACCATT 
CCAGACCATT 

hrivhn rv^ariarv-an^ 



250 
250 
250 

250 



300 
300 
300 

300 



350 
350 
350 

350 



41/68-1 propre 
cl43 propre 68-1 
42/68-1 propre 

Consensus



AAATACACGA ATTAAGGAAA CTCAAAAAGC
AAATACACGA ATTAAGGAAA CTCAAAAAGC 
AAATACACGA ATTAAGGAAA CTCAAAAAGC 
aaarpar'ar r'a a>prpAarir!aaa r^aaaAary" 



CAPTACCCAT 
CAaTACCCAT 
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TTAGTAAGAT 
TTAGTAAGAT 
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qgaca:ctga agcagaagtg gctttccagg ccctaaag 

3GACA1CTGA AGCAGAAGTG GCTTTCCAGG CCCTAAAG 
GGACA : CTGA AGCAGAAGTG GCTTTCCAGG CCCTAAAG 
rynanaUrrca ary p/^aaryTY^ nrtmryry^Tinn. fW^AAAr: 
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Consensiis 
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cons AEN 1,5,8 
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MSKV pol 
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ATTATGCCTG AAAGCCCCAC TCCCTTOTTA GOGAGAGACA TTTEAGCAAA 
ATTATGCCTG AAAGCCCCAC TCCCTTGTTA GGGAGAGACA TTTTAGCAAA
AGCAGGGGCC ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT
AGCAGGGGCC ATTATACACC TGAACATAGG AAAAGGAATA CCCATTTGCT
GTCCCCTGCT TGAGGAAGGA ATTAATCCTG AAGTCTGGGC AATAGAAGGA
GTCCCCTGCT TGAGGAAGGA ATTAATCCTG AAGTCTGGGC AATAGAAGGA
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAACTTA AACTAAAGGA
CAATATGGAC AAGCAAAGAA TGCCCGTCCT GTTCAAGTTA AACTAAAGGA
TTCTGCCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCCGAGGCCC
TTCTGCCTCC TTTCCCTACC AAAGGAAGTA CCCTCTTAGA CCCGAGGCCC
TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCA AGGCCTAGTA
TACAAGGANC TCAAAAGATT GTTAAGGACC TAAAAGCCCA AGGCCTAGTA
AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACC
AAACCATGCA GTAGCCCCTG CAATACTCCA ATTTTAGGAG TAAGGAAACC
CAACGGACAG TGGAGGTTAG TGCAAGATCT CAGGATTATT AATGAGGCTG
CAACGGACAG TGGAGGTTAG TGCAAGATCT CAGGATTATT AATGAGGCTG

TTTTTCCTCT ATACCCAGCT GTATCTAGCC CTTATACTCT GCTTTCCCTA
TTTTTCCTCT ATACCCAGCT GTATCTAGCC CTTATACTCT GCTTTCCCTA
ATACCAGAGG AAGCAGAGTG GTTTACAGTC CTGGACCTTA AGGATGCCTT
ATACCAGAGG AAGCAGAGTG GTTTACAGTC CTGGACCTTA AGGATGCCTT

TTTCTGCATC CCTGTACGTC CTGACTCTCA ATTCTTGTTT GCCTTTGAAG
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CAACTCACCT GGACTGTTrr ACCCCAAOGG 
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dCTCTGCTCA CA 3C AC o TTA 
SCTCTGCTCA CA3CAG:ITA 



^otrxA. 



i^TACTTAGG QCTAAAATTA TCCAAAC 
fiATACTTAQG QCTAAAATTA TCCAAAC 



CCAGGGCCCT CAGI3AGGAA CGTATCCAGC 
CCAGGGCCCT CAGPGAGGAA CGTATCCAGC 



: rATACTOG : 
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ITATCC I CAT 
TTATCC i CAT 




c^: [taaagcaact aaga : ggttc cttg gcata \ 

TAAAGCAACT AAGA?QGTTC CTTGGCATA/^ 
TAArifY^nnPT nrfr^^^^^^^^ rnnraTrftT&lT 




icCGAAHftMG ATTCCCtQlr ACApCCdAAT AGCCAG^CCA TTAT^^TACA: 
CXXSAAIftrCG ATTCCCo3 = T ACA3YG?AAT AGCCAG:CCA TTAT^rACAr 
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